ANGIOGENESIS MODULATING COMPOUNDS

TECHNICAL FIELD



This invention is in the field of molecular biology and medicine. More specifically,
it relates to novel vector constructs, compositions, and methods of use thereof for screening
compounds in host cells or transgenic animals. Further, the invention relates to vector
constructs and methods of use thereof to generate transgenic organisms, particularly
transgenic mice.



A requirement for cellular inflow of nutrients, outflow of waste products, and gas
exchange in most tissues and organs is the establishment of a vascular supply. Several
processes for blood vessel development and differentiation have been identified. One such
process is termed "vasculogenesis" and takes place in the embryo. This process consists of
the in situ differentiation of mesenchymal cells into hemangioblasts, which are the
precursors of both endothelial cells and blood cells. "Angiogenesis" is a second such process
and involves the formation of new blood vessels from a preexisting endothelium. This
process is required for (i) the development of embryonic vasculature, and (ii) a variety of
post-natal processes, including, but not limited to, wound healing, tissue regeneration, and
organ regeneration. Further, angiogenesis has been identified as a requirement for solid
tumor growth and uncontrolled blood cell proliferation.

Vascular Endothelial Growth Factor (VEGF; also designated as vascular
permeability factor (VPF)) has been identified as a regulator of normal and pathological
angiogenesis. VEGF is a secreted growth factor having the following properties: (i) it is an
endothelial cell-specific mitogen; (ii) it is angiogenic in vivo and induces vascular
permeability; (iii) expression of VEGF (and of its receptors) has been correlated with
vasculogenesis and angiogenesis during embryonic development; and (iv) it is
expressed in tumor cells. The VEGF receptor appears to be expressed exclusively in
adjacent small blood vessels. VEGF appears to play a crucial role in the vascularization of a
wide range of tumors including, but not limited to, breast cancers, ovarian tumors, brain
tumors, kidney and bladder carcinomas, adenocarcinomas and malignant gliomas. Tumors
have been shown to produce ample amounts of VEGF, which stimulates the proliferation and
migration of endothelial cells (ECs). This is thought to induce tumor vascularization by a
paracrine mechanism.

The angiogenic effect of VEGF appears to be mediated by its binding to high
affinity cell surface VEGF receptors (VEGFR).

SUMMARY OF THE INVENTION

The invention provides methods and compositions relating to the VEGFR-2 gene
transcriptional promoter. The compositions include recombinant regulators of
gene expression comprising the VEGFR-2 promoter of SEQ ID NO:32, further sequences
isolated based on the teachings disclosed herein, or deletion mutants thereof, typically at
least 10, 20, 25, 50, or 100 bp in length, where the sequences maintain cis transcriptional
regulatory activity.

The invention also provides hybridization probes and replication/amplification
primers having a hitherto novel VEGFR-2 specific sequence contained in SEQ ID NO:32
(including its complement and analogs and complements thereof having the
corresponding sequence, e.g., in RNA) and sufficient to effect specific
hybridization thereto (i.e., specifically hybridize with SEQ ID NO:32 in the
presence of genomic DNA).

The invention also provides cells and vectors comprising the disclosed VEGFR-2
regulators, including cells comprising such regulators operably linked to
non-VEGFR-2 coding sequences (i.e., a heterologous coding sequence). Such cells find use
in the disclosed methods for identifying agents or compounds that regulate the activity of a
VEGFR-2 promoter. In an exemplary method, the cells are contacted with a candidate agent
under conditions wherein, but for the presence of said agent, the VEGFR-2 promoter
exhibits a first expression of a reporter; a second expression of the reporter is then
detected, wherein a difference between said first and said second expression of the reporter
indicates that the agent affects the expression mediated by the VEGFR-2 promoter.

In another aspect, this invention relates to a substantially purified nucleic acid
molecule comprising a VEGFR-2 promoter region (i.e., an isolated polynucleotide). In one
embodiment the isolated polynucleotide comprises the sequence presented as SEQ ID
NO:32, and fragments thereof which maintain cis-acting transcriptional activity, in
particular, a regulator of gene expression derived from SEQ ID NO:32 wherein said
polynucleotide sequence has cis transcriptional regulatory activity. A further embodiment
includes an isolated polynucleotide comprising a cis-acting transcription regulator having
X contiguous nucleotides, wherein (i) the X contiguous nucleotides have at least about 90%
identity to Y contiguous nucleotides derived from SEQ ID NO:32, (ii) X equals Y, and (iii)
X is greater than or equal to 50. Exemplary values of X include, but are not limited to, X
greater than or equal to 500, X greater than or equal to 3563, and X in the range of 50-3570,
including all integer values in that range.
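
By way of illustration only, the X/Y identity criterion recited above can be sketched in
Python. The sketch makes simplifying assumptions that are not part of the disclosure: the
comparison is ungapped, the function names are hypothetical, and the reference string passed
in merely stands in for SEQ ID NO:32, which is not reproduced here.

```python
def best_window_identity(candidate: str, reference: str) -> float:
    """Highest percent identity between `candidate` (X nucleotides) and any
    window of X contiguous nucleotides of `reference` (ungapped comparison)."""
    x = len(candidate)
    if x == 0 or x > len(reference):
        return 0.0
    best = 0.0
    for start in range(len(reference) - x + 1):
        window = reference[start:start + x]
        # Count positions at which the two equal-length strings agree.
        matches = sum(a == b for a, b in zip(candidate, window))
        best = max(best, 100.0 * matches / x)
    return best


def meets_embodiment(candidate: str, reference: str,
                     min_length: int = 50, min_identity: float = 90.0) -> bool:
    """True when X >= min_length and some equal-length reference window
    (so that X equals Y) shows at least min_identity percent identity."""
    return (len(candidate) >= min_length
            and best_window_identity(candidate, reference) >= min_identity)
```

A gapped alignment, such as the Smith-Waterman approach discussed in the definitions
below, would be needed to handle insertions or deletions; the ungapped scan is used here only
to keep the example short.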

The invention also includes an expression cassette comprising the above-described
polynucleotides. The invention also relates to expression vectors comprising the
aforementioned polynucleotide sequences and host cells transformed with these expression
vectors.

In another aspect, the invention also relates to methods for detecting test agents
which modulate transcription of the VEGFR-2 promoters described above. Such methods
include contacting a host cell transformed with an expression vector comprising the
VEGFR-2 promoter DNA sequence operably linked to a reporter sequence with the test
agent and comparing the level of transcription observed in the presence of the test agent
to the level of transcription observed in its absence.
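
The comparison of reporter expression in the presence and absence of a test agent can be
sketched as follows. This is a minimal illustration, not a prescribed analysis: the function
names, the two-fold threshold, and the replicate readings used in the usage note are all
hypothetical assumptions.

```python
from statistics import mean


def reporter_fold_change(with_agent, without_agent):
    """Ratio of mean reporter signal with the test agent to the mean signal
    without it (e.g., luciferase light units from replicate wells)."""
    baseline = mean(without_agent)
    if baseline <= 0:
        raise ValueError("baseline reporter signal must be positive")
    return mean(with_agent) / baseline


def classify_agent(with_agent, without_agent, threshold=2.0):
    """Label an agent by the direction of its effect on reporter expression.
    The threshold is an illustrative assumption, not a value from the text."""
    fold = reporter_fold_change(with_agent, without_agent)
    if fold >= threshold:
        return "activator"
    if fold <= 1.0 / threshold:
        return "inhibitor"
    return "no effect"
```

For example, `classify_agent([200, 220, 180], [100, 95, 105])` compares a mean signal of 200
against a baseline of 100. In practice a statistical test across replicates would normally
replace the fixed threshold.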
The invention also relates to transgenic or chimeric animals whose cells express
a heterologous gene under the transcriptional control of a VEGFR-2 promoter, and methods
of using such animals as described herein.

In another aspect, the invention relates to the above embodiments wherein the
promoter sequence is derived from Tie2.

These and other embodiments of the present invention will be apparent to those of 
skill in the art in view of the teachings herein. 

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1 is a schematic depicting construction of the pTK53 vector. Polynucleotides
encoding PGK-P, Neo and TK and 5' and 3' linkers are introduced into a pKS backbone to
produce the vector designated pTK53.

Figure 2 is a schematic depicting construction of the pTK-LucR and pTK-LucYG
vectors. For pTK-LucR, a polynucleotide encoding LucR is introduced into pTK53. Thus,
the pTK-LucR construct contains the PGK-P gene, a neomycin (Neo^r) gene, a thymidine
kinase (TK) gene and a sequence encoding red luciferase (Luc-R). For pTK-LucYG, a
polynucleotide encoding LucYG is introduced into pTK53. Thus, the pTK-LucYG
construct contains the PGK-P gene, a neomycin (Neo^r) gene, a thymidine kinase (TK) gene
and a sequence encoding yellow-green luciferase (Luc-YG).

Figure 3A is a schematic depicting the vector pTKLR-Vn. Sequences homologous
to the vitronectin gene are inserted into pTK-LucR such that they flank the Neo^r gene and
the Luc-R coding sequence. Figure 3B is a schematic depicting targeting of the linearized
pTKLR-Vn vector to the vitronectin chromosomal locus. The VEGF promoter is cloned
into the polylinkers between Neo and Luc-R. Upon homologous recombination, the
Neo-VEGF-LucR transgene is inserted into the Vn gene. In the figure, (A) shows the targeting
vector pTKLR-Vn and (B) shows the mouse vitronectin gene. In the figure, Neo -
neomycin resistance encoding sequences; TK - thymidine kinase encoding sequences; LucR
- red luciferase from pGL3Red (Dr. Christopher Contag, Stanford University, Stanford,
CA). Regions bearing Vn gene translational start and stop codons are indicated with arrows.
Poly(A) sequences are placed upstream of the polylinker to prevent or minimize
read-through translation. Figure 3C shows the nucleotide sequence of vitronectin.

Figure 4A is a schematic depicting the vector pTKLG-Fos. Sequences homologous
to the FosB gene are inserted into pTK-LucYG such that they flank the Neo^r gene and the
Luc-YG coding sequence. Figure 4B shows the nucleotide sequence of FosB.

Figures 5A, 5B, and 5C depict the nucleotide sequence of the entire promoter region
of the VEGFR2 mouse gene (SEQ ID NO:32).

Figure 6 depicts PCR conditions for genomic screening for promoters useful in
exemplary targeting constructs of the present invention.

Figure 7 depicts generation of targeted transgenic mice using the targeting vectors
described herein.

Figure 8 depicts a schematic representation of Southern blot analysis of homologous
DNA recombination between the pTKLG-Fos targeting vector and the FosB gene.

Figure 9 depicts generation of targeted transgenic mice, using the targeting vectors
described herein, and crosses using such transgenics as well as their offspring (F1, first
generation; F2, second generation).

Figure 10 depicts crosses using transgenic mice of the present invention to generate 
dual luciferase transgenic mice. 

Figure 11 depicts the nucleotide sequence of a 511 bp enhancer region of VEGFR2
(SEQ ID NO:35).

Figure 12 is a schematic depicting engineering of the pGL3B2-KPN construct.
pGL3B (Promega, Madison, WI) contains the yellow-green luciferase gene (Luc-YG). The
construct contains a 4.5 kb fragment of the VEGFR2 promoter and a 0.5 kb fragment of the
VEGFR2 enhancer.

Figure 13 is a schematic depicting engineering of the pTKLG-Fos-KPN construct
made using pGL3B2-KPN (Figure 12) and pTKLG-Fos.

Figure 14 is a schematic depicting targeting of the linearized pTKLG-Fos vector to
the FosB chromosomal locus. The VEGFR2 promoter is cloned into the polylinkers
between Neo and Luc-YG. Upon homologous recombination, the Neo-VEGFR2-LucYG
transgene will be inserted into a sequence associated with production of FosB. In the figure,
(A) shows the targeting vector, and (B) shows the mouse target gene. In the figure, Neo -
neomycin resistance encoding sequences; TK - thymidine kinase encoding sequences;
LucYG - yellow green luciferase from pGL3-control vector (Promega, Madison, WI).
Regions bearing FosB gene translational start and stop codons are indicated with arrows.
Poly(A) sequences are placed upstream of the polylinker to prevent or minimize
read-through translation.

Figure 15 depicts the nucleotide sequence of the entire promoter region of the Tie2
mouse gene (SEQ ID NO:40).

Figure 16 depicts the nucleotide sequence of a 1.7 kb enhancer region of Tie2 (SEQ
ID NO:41).

Figure 17 is a schematic depicting engineering of the pGL3B2-TPN construct.
pGL3B (Promega, Madison, WI) contains the yellow-green luciferase gene (Luc-YG). The
construct contains a 7.1 kb fragment of the Tie2 promoter and a 1.7 kb fragment of the Tie2
enhancer.

Figure 18 is a schematic depicting engineering of the pTKLG-Fos-KPN construct
made using pGL3B2-TPN (Figure 17) and pTKLG-Fos.

Figure 19 is a schematic depicting targeting of the linearized pTKLG-Fos vector to
the FosB chromosomal locus. The Tie2 promoter is cloned into the polylinkers between
Neo and Luc-YG. Upon homologous recombination, the Neo-Tie2-LucYG transgene is
inserted into the FosB gene. In the figure, (A) shows the targeting vector, and (B) shows the
mouse target gene. In the figure, Neo - neomycin resistance encoding sequences; TK -
thymidine kinase encoding sequences; LucYG - yellow green luciferase from pGL3-control
vector (Promega). Regions bearing FosB gene translational start and stop codons are
indicated with arrows. Poly(A) sequences are placed upstream of the polylinker to prevent
or minimize read-through translation.
MODES FOR CARRYING OUT THE INVENTION

Throughout this application, various publications, patents, and published patent
applications are referred to by an identifying citation. The disclosures of these publications,
patents, and published patent specifications referenced in this application are hereby
incorporated by reference into the present disclosure to more fully describe the state of the
art to which this invention pertains.

The practice of the present invention will employ, unless otherwise indicated,
conventional techniques of molecular biology, microbiology, cell biology and recombinant
DNA, which are within the skill of the art. See, e.g., Sambrook, Fritsch, and Maniatis,
MOLECULAR CLONING: A LABORATORY MANUAL, 2nd edition (1989); CURRENT
PROTOCOLS IN MOLECULAR BIOLOGY (F.M. Ausubel et al. eds., 1987); the series
METHODS IN ENZYMOLOGY (Academic Press, Inc.); PCR 2: A PRACTICAL
APPROACH (M.J. McPherson, B.D. Hames and G.R. Taylor eds., 1995); and ANIMAL
CELL CULTURE (R.I. Freshney ed., 1987).
All publications, patents and patent applications cited herein, whether supra or infra,
are hereby incorporated by reference in their entirety.

As used in this specification and the appended claims, the singular forms "a," "an" 
and "the" include plural references unless the content clearly dictates otherwise. Thus, for 
example, reference to "an antigen" includes a mixture of two or more such agents. 


Definitions 

As used herein, certain terms will have specific meanings. 

The terms "nucleic acid molecule" and "polynucleotide" are used interchangeably
and refer to a polymeric form of nucleotides of any length, either deoxyribonucleotides or
ribonucleotides, or analogs thereof. Polynucleotides may have any three-dimensional
structure, and may perform any function, known or unknown. Non-limiting examples of
polynucleotides include a gene, a gene fragment, exons, introns, messenger RNA (mRNA),
transfer RNA, ribosomal RNA, ribozymes, cDNA, recombinant polynucleotides, branched
polynucleotides, plasmids, vectors, isolated DNA of any sequence, isolated RNA of any
sequence, nucleic acid probes, and primers.

A polynucleotide is typically composed of a specific sequence of four nucleotide
bases: adenine (A); cytosine (C); guanine (G); and thymine (T) (uracil (U) for thymine (T)
when the polynucleotide is RNA). Thus, the term polynucleotide sequence is the
alphabetical representation of a polynucleotide molecule. This alphabetical representation
can be input into databases in a computer having a central processing unit and used for
bioinformatics applications such as functional genomics and homology searching.
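
As a simple illustration of working with this alphabetical representation, the Python
sketch below converts a DNA string to the corresponding RNA string (uracil for thymine) and
computes a reverse complement. The function names are hypothetical, and the sketch assumes an
unambiguous A/C/G/T alphabet with no degenerate base codes.

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def to_rna(dna: str) -> str:
    """Return the RNA representation of a DNA string (U replaces T)."""
    return dna.upper().replace("T", "U")


def reverse_complement(dna: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return dna.upper().translate(_COMPLEMENT)[::-1]
```

For instance, `to_rna("ATGC")` gives `"AUGC"` and `reverse_complement("ATGC")` gives
`"GCAT"`.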

A "coding sequence," or a sequence which "encodes" a selected polypeptide, is a
nucleic acid molecule which is transcribed (in the case of DNA) and translated (in the case
of mRNA) into a polypeptide, for example, in vivo when placed under the control of
appropriate regulatory sequences (or "control elements"). The boundaries of the coding
sequence are typically determined by a start codon at the 5' (amino) terminus and a
translation stop codon at the 3' (carboxy) terminus. A coding sequence can include, but is
not limited to, cDNA from viral, procaryotic or eucaryotic mRNA, genomic DNA sequences
from viral or procaryotic DNA, and even synthetic DNA sequences. A transcription
termination sequence may be located 3' to the coding sequence. Other "control elements"
may also be associated with a coding sequence. A DNA sequence encoding a polypeptide
can be optimized for expression in a selected cell by using the codons preferred by the
selected cell to represent the DNA copy of the desired polypeptide coding sequence.

"Encoded by" refers to a nucleic acid sequence which codes for a polypeptide sequence, 
wherein the polypeptide sequence or a portion thereof contains an amino acid sequence of at
least 3 to 5 amino acids, more preferably at least 8 to 10 amino acids, and even more
preferably at least 15 to 20 amino acids from a polypeptide encoded by the nucleic acid
sequence. Also encompassed are polypeptide sequences which are immunologically
identifiable with a polypeptide encoded by the sequence.
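
The start-codon and stop-codon boundaries of a coding sequence, as described above, can
be illustrated with a short Python sketch that extracts the first open reading frame from a
DNA string. The function name is hypothetical, and the sketch assumes the standard genetic
code (ATG as the start codon; TAA, TAG, and TGA as stop codons) and a single strand.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code, DNA alphabet


def first_coding_sequence(dna: str):
    """Return the first ATG...stop segment read in frame, or None if no
    start codon is followed by an in-frame stop codon."""
    seq = dna.upper()
    start = seq.find("ATG")
    while start != -1:
        # Step codon by codon in the frame set by this start codon.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                return seq[start:pos + 3]
        start = seq.find("ATG", start + 1)
    return None
```

For example, `first_coding_sequence("CCATGAAATAGGG")` returns `"ATGAAATAG"`, while a
sequence with a start codon but no in-frame stop codon returns `None`.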

A "transcription factor" as used herein typically refers to a protein (or polypeptide)
which affects the transcription, and accordingly the expression, of a specified gene. A
transcription factor may refer to a single polypeptide transcription factor, one or more
polypeptides acting sequentially or in concert, or a complex of polypeptides.

Typical "control elements" include, but are not limited to, transcription promoters,
transcription enhancer elements, transcription regulating elements (transcription regulators),
transcription termination signals, polyadenylation sequences (located 3' to the translation
stop codon), sequences for optimization of initiation of translation (located 5' to the coding
sequence), translation enhancing sequences, and translation termination sequences.
Transcription promoters can include inducible promoters (where expression of a
polynucleotide sequence operably linked to the promoter is induced by an analyte, cofactor,
regulatory protein, etc.), repressible promoters (where expression of a polynucleotide
sequence operably linked to the promoter is repressed by an analyte, cofactor, regulatory
protein, etc.), and constitutive promoters. A transcription regulator is a cis-acting element
that affects the transcription of a gene, for example, a region of a promoter with which a
transcription factor interacts to induce expression of a gene.

"Expression enhancing sequences" typically refer to control elements that improve
transcription or translation of a polynucleotide relative to the expression level in the absence
of such control elements (for example, promoters, promoter enhancers, enhancer elements,
and translational enhancers (e.g., Shine-Dalgarno sequences)).

"Purified polynucleotide" refers to a polynucleotide of interest or fragment thereof
which is essentially free, e.g., contains less than about 50%, preferably less than about 70%,
and more preferably less than about 90%, of the protein with which the polynucleotide is
naturally associated. Techniques for purifying polynucleotides of interest are well-known in
the art and include, for example, disruption of the cell containing the polynucleotide with a
chaotropic agent and separation of the polynucleotide(s) and proteins by ion-exchange
chromatography, affinity chromatography and sedimentation according to density.

A "heterologous sequence" as used herein typically refers to either (i) a nucleic
acid sequence that is not normally found in the cell or organism of interest, or (ii) a nucleic
acid sequence introduced at a genomic site wherein the nucleic acid sequence does not
normally occur in nature at that site. For example, a DNA sequence encoding a polypeptide
can be obtained from yeast and introduced into a bacterial cell. In this case the yeast DNA
sequence is "heterologous" to the native DNA of the bacterial cell. Alternatively, a
promoter sequence from a Tie2 gene can be introduced into the genomic location of a fosB
gene. In this case the Tie2 promoter sequence is "heterologous" to the native fosB genomic
sequence.

A "polypeptide" is used in its broadest sense to refer to a compound of two or more
subunit amino acids, amino acid analogs, or other peptidomimetics. The subunits may be
linked by peptide bonds or by other bonds, for example ester, ether, etc. As used herein, the
term "amino acid" refers to either natural and/or unnatural or synthetic amino acids,
including glycine and both the D or L optical isomers, and amino acid analogs and
peptidomimetics. A peptide of three or more amino acids is commonly called an
oligopeptide if the peptide chain is short. If the peptide chain is long, the peptide is typically
called a polypeptide or a protein.

"Operably linked" refers to an arrangement of elements wherein the components so
described are configured so as to perform their usual function. Thus, a given promoter that
is operably linked to a coding sequence (e.g., a reporter expression cassette) is capable of
effecting the expression of the coding sequence when the proper enzymes are present. The
promoter or other control elements need not be contiguous with the coding sequence, so
long as they function to direct the expression thereof. For example, intervening untranslated
yet transcribed sequences can be present between the promoter sequence and the coding
sequence and the promoter sequence can still be considered "operably linked" to the coding
sequence.

"Recombinant" as used herein to describe a nucleic acid molecule means a 
polynucleotide of genomic, cDNA, semisynthetic, or synthetic origin which, by virtue of its
origin or manipulation: (1) is not associated with all or a portion of the polynucleotide with
which it is associated in nature; and/or (2) is linked to a polynucleotide other than that to
which it is linked in nature. The term "recombinant" as used with respect to a protein or
polypeptide means a polypeptide produced by expression of a recombinant polynucleotide.
"Recombinant host cells," "host cells," "cells," "cell lines," "cell cultures," and other such
terms denoting procaryotic microorganisms or eucaryotic cell lines cultured as unicellular
entities, are used interchangeably, and refer to cells which can be, or have been, used as
recipients for recombinant vectors or other transfer DNA, and include the progeny of the
original cell which has been transfected. It is understood that the progeny of a single
parental cell may not necessarily be completely identical in morphology or in genomic or
total DNA complement to the original parent, due to accidental or deliberate mutation.
Progeny of the parental cell which are sufficiently similar to the parent to be characterized
by the relevant property, such as the presence of a nucleotide sequence encoding a desired
peptide, are included in the progeny intended by this definition, and are covered by the
above terms.

An "isolated polynucleotide" molecule is a nucleic acid molecule separate and
discrete from the whole organism with which the molecule is found in nature; or a nucleic
acid molecule devoid, in whole or part, of sequences normally associated with it in nature;
or a sequence, as it exists in nature, but having heterologous sequences (as defined above) in
association therewith.

Techniques for determining nucleic acid and amino acid "sequence identity" also are
known in the art. Typically, such techniques include determining the nucleotide sequence of
the mRNA for a gene and/or determining the amino acid sequence encoded thereby, and
comparing these sequences to a second nucleotide or amino acid sequence. In general,
"identity" refers to an exact nucleotide-to-nucleotide or amino acid-to-amino acid
correspondence of two polynucleotides or polypeptide sequences, respectively. Two or
more sequences (polynucleotide or amino acid) can be compared by determining their
"percent identity." The percent identity of two sequences, whether nucleic acid or amino
acid sequences, is the number of exact matches between two aligned sequences divided by
the length of the shorter sequence and multiplied by 100. An approximate alignment for
nucleic acid sequences is provided by the local homology algorithm of Smith and
Waterman, Advances in Applied Mathematics 2:482-489 (1981). This algorithm can be
applied to amino acid sequences by using the scoring matrix developed by Dayhoff, Atlas of
Protein Sequences and Structure, M.O. Dayhoff ed., 5 suppl. 3:353-358, National
Biomedical Research Foundation, Washington, D.C., USA, and normalized by Gribskov,
Nucl. Acids Res. 14(6):6745-6763 (1986). An exemplary implementation of this algorithm
to determine percent identity of a sequence is provided by the Genetics Computer Group
(Madison, WI) in the "BestFit" utility application. The default parameters for this method
are described in the Wisconsin Sequence Analysis Package Program Manual, Version 8
(1995) (available from Genetics Computer Group, Madison, WI). A preferred method of
establishing percent identity in the context of the present invention is to use the MPSRCH
package of programs copyrighted by the University of Edinburgh, developed by John F.
Collins and Shane S. Sturrok, and distributed by IntelliGenetics, Inc. (Mountain View, CA).
From this suite of packages the Smith-Waterman algorithm can be employed where default
parameters are used for the scoring table (for example, gap open penalty of 12, gap
extension penalty of one, and a gap of six). From the data generated the "Match" value
reflects "sequence identity." Other suitable programs for calculating the percent identity or
similarity between sequences are generally known in the art; for example, another alignment
program is BLAST, used with default parameters. For example, BLASTN and BLASTP
can be used with the following default parameters: genetic code = standard; filter = none;
strand = both; cutoff = 60; expect = 10; Matrix = BLOSUM62; Descriptions = 50 sequences;
sort by = HIGH SCORE; Databases = non-redundant, GenBank + EMBL + DDBJ + PDB +
GenBank CDS translations + Swiss protein + Spupdate + PIR. Details of these programs
can be found at the following internet address: http://www.ncbi.nlm.gov/cgi-bin/BLAST.
When claiming sequences relative to sequences of the present invention, the range of desired
degrees of sequence identity is approximately 80% to 100% and integer values
therebetween. Typically the percent identities between the disclosed sequences and the
claimed sequences are at least 80-82%, 85-90%, preferably 92%, more preferably 95%, and
even more preferably 98% sequence identity to the reference sequence (i.e., the sequences of
the present invention).
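
The percent identity calculation defined above (exact matches between two aligned
sequences, divided by the length of the shorter sequence, multiplied by 100) can be sketched
in Python as follows. The sketch assumes the two sequences have already been aligned by some
other tool and that gaps are written as "-"; the function name and the gap convention are
illustrative assumptions, and the sketch does not reproduce the output of BestFit, MPSRCH,
or BLAST.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences of equal aligned length.

    Matches are counted column by column (gap columns never match) and
    divided by the ungapped length of the shorter sequence.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = sum(a == b and a != gap for a, b in zip(aligned_a, aligned_b))
    length_a = len(aligned_a) - aligned_a.count(gap)
    length_b = len(aligned_b) - aligned_b.count(gap)
    shorter = min(length_a, length_b)
    if shorter == 0:
        raise ValueError("sequences must contain at least one residue")
    return 100.0 * matches / shorter
```

For example, `percent_identity("ACGTACGT", "ACGTTCGT")` has seven matching columns over
a shorter length of eight, giving 87.5.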

Alternatively, the degree of sequence similarity between polynucleotides can be
determined by hybridization of polynucleotides under conditions that form stable duplexes
between homologous regions, followed by digestion with single-stranded-specific
nuclease(s), and size determination of the digested fragments. Two DNA sequences, or two
polypeptide sequences, are "substantially homologous" to each other when the sequences
exhibit at least about 80%-85%, preferably at least about 85%-90%, more preferably at least
about 90%-95%, and most preferably at least about 95%-98% sequence identity over a
defined length of the molecules, as determined using the methods above. As used herein,
substantially homologous also refers to sequences showing complete identity to the specified
DNA or polypeptide sequence. DNA sequences that are substantially homologous can be
identified in a Southern hybridization experiment under, for example, stringent conditions,
as defined for that particular system. Defining appropriate hybridization conditions is within
the skill of the art. See, e.g., Sambrook et al., supra; DNA Cloning, supra; Nucleic Acid
Hybridization, supra.

Two nucleic acid fragments are considered to "selectively hybridize" as described
herein. The degree of sequence identity between two nucleic acid molecules affects the
efficiency and strength of hybridization events between such molecules. A partially
identical nucleic acid sequence will at least partially inhibit a completely identical sequence
from hybridizing to a target molecule. Inhibition of hybridization of the completely
identical sequence can be assessed using hybridization assays that are well known in the art
(e.g., Southern blot, Northern blot, solution hybridization, or the like, see Sambrook, et al.,
Molecular Cloning: A Laboratory Manual, Second Edition, (1989) Cold Spring Harbor,
N.Y.). Such assays can be conducted using varying degrees of selectivity, for example,
using conditions varying from low to high stringency. If conditions of low stringency are
employed, the absence of non-specific binding can be assessed using a secondary probe that
lacks even a partial degree of sequence identity (for example, a probe having less than about
30% sequence identity with the target molecule), such that, in the absence of non-specific
binding events, the secondary probe will not hybridize to the target.

When utilizing a hybridization-based detection system, a nucleic acid probe is
chosen that is complementary to a target nucleic acid sequence, and then by selection of
appropriate conditions the probe and the target sequence "selectively hybridize," or bind, to
each other to form a hybrid molecule. A nucleic acid molecule that is capable of hybridizing
selectively to a target sequence under "moderately stringent" conditions typically hybridizes
under conditions that allow detection of a target nucleic acid sequence of at least about 10-14
nucleotides in length having at least approximately 70% sequence identity with the sequence
of the selected nucleic acid probe. Stringent hybridization conditions typically allow
detection of target nucleic acid sequences of at least about 10-14 nucleotides in length
having a sequence identity of greater than about 90-95% with the sequence of the selected
nucleic acid probe. Hybridization conditions useful for probe/target hybridization where the
probe and target have a specific degree of sequence identity can be determined as is known
in the art (see, for example, Nucleic Acid Hybridization: A Practical Approach, editors B.D.
Hames and S.J. Higgins, (1985) Oxford; Washington, DC; IRL Press).
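
The approximate relationship described above between sequence identity, target length,
and stringency can be summarized in a small Python sketch. The cut-off values are taken
from the lower ends of the ranges stated above (10 nucleotides, 70%, and 90%) and are an
illustrative simplification; the function name is hypothetical, and actual hybridization
behavior depends on the experimental conditions discussed in the following paragraph.

```python
def detectable_under(identity_pct: float, target_length: int,
                     min_length: int = 10) -> list:
    """Return the stringency levels under which a target of the given length
    and percent identity to the probe would typically be detected."""
    if target_length < min_length:
        return []
    levels = []
    if identity_pct >= 70.0:   # "at least approximately 70%"
        levels.append("moderately stringent")
    if identity_pct > 90.0:    # "greater than about 90-95%" (lower bound used)
        levels.append("stringent")
    return levels
```

Under these assumptions, a 14-nucleotide target with 75% identity to the probe would be
listed only under moderately stringent conditions.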

With respect to stringency conditions for hybridization, it is well known in the art
that numerous equivalent conditions can be employed to establish a particular stringency by
varying, for example, the following factors: the length and nature of probe and target
sequences, base composition of the various sequences, concentrations of salts and other
hybridization solution components, the presence or absence of blocking agents in the
hybridization solutions (e.g., formamide, dextran sulfate, and polyethylene glycol),
hybridization reaction temperature and time parameters, as well as varying wash conditions.
A particular set of hybridization conditions is selected following standard
methods in the art (see, for example, Sambrook, et al., Molecular Cloning: A Laboratory
Manual, Second Edition, (1989) Cold Spring Harbor, N.Y.).

A "vector" is capable of transferring gene sequences to target cells. Typically,
"vector construct," "expression vector," and "gene transfer vector," mean any nucleic acid
construct capable of directing the expression of a gene of interest and which can transfer
gene sequences to target cells. Thus, the term includes cloning and expression vehicles, as
well as integrating vectors.

"Nucleic acid expression vector" or "expression cassette" refers to an assembly 
which is capable of directing the expression of a sequence or gene of interest. The nucleic 
acid expression vector includes a promoter which is operably linked to the sequences or 
gene(s) of interest. Other control elements may be present as well. Expression cassettes 
described herein may be contained within a plasmid construct. In addition to the 
components of the expression cassette, the plasmid construct may also include a bacterial 
origin of replication, one or more selectable markers, a signal which allows the plasmid 
construct to exist as single-stranded DNA (e.g., an M13 origin of replication), a multiple 
cloning site, and a "mammalian" origin of replication (e.g., an SV40 or adenovirus origin of 
replication). 

An "expression cassette" comprises any nucleic acid construct capable of directing 
the expression of a gene/coding sequence of interest. Such cassettes can be constructed into 
a "vector," "vector construct," "expression vector," or "gene transfer vector," in order to 
transfer the expression cassette into target cells. Thus, the term includes cloning and 
expression vehicles, as well as viral vectors. 

"Luciferase," unless stated otherwise, includes prokaryotic and eukaryotic 
luciferases, as well as variants possessing varied or altered optical properties, such as 
luciferases that produce different colors of light (e.g., Kajiyama, N., and Nakano, E., Protein 
Engineering 4(6):691-693 (1991)). 

"Light-generating" is defined as capable of generating light through a chemical 
reaction or through the absorption of radiation. 

A "light generating protein" or "light-emitting protein" is a protein capable of 
generating light in the visible spectrum (between approximately 350 nm and 800 nm). 
Examples include bioluminescent proteins such as luciferases, e.g., bacterial and firefly 
luciferases, as well as fluorescent proteins such as green fluorescent protein (GFP). 

"Light" is defined herein, unless stated otherwise, as electromagnetic radiation 
having a wavelength of between about 300 nm and about 1100 nm. 

"Animal" as used herein typically refers to a non-human mammal, including, without 
limitation, farm animals such as cattle, sheep, pigs, goats and horses; domestic mammals 
such as dogs and cats; laboratory animals including rodents such as mice, rats and guinea 
pigs; birds, including domestic, wild and game birds such as chickens, turkeys and other 
gallinaceous birds, ducks, geese, and the like. The term does not denote a particular age. 
Thus, both adult and newborn individuals are intended to be covered. 

A "transgenic animal" refers to a genetically engineered animal or offspring of 
genetically engineered animals. A transgenic animal usually contains material from at least 
one unrelated organism, such as from a virus, plant, or other animal. The "non-human 
animals" of the invention include vertebrates such as rodents, non-human primates, sheep, 
dogs, cows, amphibians, birds, fish, insects, reptiles, etc. The term "chimeric animal" is 
used to refer to animals in which the heterologous gene is found, or in which the 
heterologous gene is expressed in some but not all cells of the animal. 

"Analyte" as used herein refers to any compound or substance whose effects (e.g., 
induction or repression of a specific promoter) can be evaluated using the test animals and 
methods of the present invention. Such analytes include, but are not limited to, chemical 
compounds, pharmaceutical compounds, polypeptides, peptides, polynucleotides, and 
polynucleotide analogs. Many organizations (e.g., the National Institutes of Health, 
pharmaceutical and chemical corporations) have large libraries of chemical or biological 
compounds from natural or synthetic processes, or fermentation broths or extracts. Such 
compounds/analytes can be employed in the practice of the present invention. 

As used herein, the term "positive selection marker" refers to a gene encoding a 
product that enables only the cells that carry the gene to survive and/or grow under certain 
conditions. For example, plant and animal cells that express the introduced neomycin 
resistance (Neo^r) gene are resistant to the compound G418. Cells that do not carry the Neo^r 
gene marker are killed by G418. Other positive selection markers will be known to those of 
skill in the art. Typically, positive selection markers encode products that can be readily 
assayed. Thus, positive selection markers can be used to determine whether a particular 
DNA construct has been introduced into a cell, organ or tissue. 

"Negative selection marker" refers to a gene encoding a product which can be used to 
selectively kill and/or inhibit growth of cells under certain conditions. Non-limiting 
examples of negative selection inserts include a herpes simplex virus (HSV)-thymidine 
kinase (TK) gene. Cells containing an active HSV-TK gene are incapable of growing in the 
presence of ganciclovir or similar agents. Thus, depending on the substrate, some gene 
products can act as either positive or negative selection markers. 
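
The logic of positive and negative selection described above can be summarized in a toy Boolean model. This is only a sketch: the marker and drug labels are the two examples named in the text, the function name `survives` is hypothetical, and real selection depends on dose, timing, and expression level.

```python
def survives(markers: set, drugs: set) -> bool:
    """Toy model: does a cell carrying `markers` survive exposure to `drugs`?"""
    # Positive selection: G418 kills cells lacking the neomycin resistance gene.
    if "G418" in drugs and "neo" not in markers:
        return False
    # Negative selection: ganciclovir kills cells with an active HSV-TK gene.
    if "ganciclovir" in drugs and "HSV-TK" in markers:
        return False
    return True


both_drugs = {"G418", "ganciclovir"}
print(survives({"neo"}, both_drugs))            # carries only the positive marker
print(survives({"neo", "HSV-TK"}, both_drugs))  # also carries the negative marker
print(survives(set(), both_drugs))              # carries neither marker
```

In this model, only cells that carry the positive marker and lack the negative marker survive exposure to both agents.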



PATENT 
PXE-012.US 




The term "homologous recombination" refers to the exchange of DNA fragments 
between two DNA molecules or chromatids at the site of essentially identical nucleotide 
sequences. It is understood that substantially homologous sequences can accommodate 
insertions, deletions, and substitutions in the nucleotide sequence. Thus, linear sequences of 
nucleotides can be essentially identical even if some of the nucleotide residues do not 
precisely correspond or align (see, above). 

A "knock-out" mutation refers to partial or complete loss of expression of at least a 
portion of the target gene. Examples of knock-out mutations include, but are not limited to, 
gene-replacement by heterologous sequences, gene disruption by heterologous sequences, 
and deletion of essential elements of the gene (e.g., promoter region, portions of a coding 
sequence). A "knock-out" mutation is typically identified by the phenotype generated by 
the mutation. 

A "single-copy gene" as used herein refers to a gene represented in an organism's 
genome only by a single copy at a particular chromosomal locus. Accordingly, a diploid 
organism has two copies of the gene and both copies occur at the same chromosomal 
location. 

A "gene" as used in the context of the present invention is a sequence of nucleotides 
in a genetic nucleic acid (chromosome, plasmid, etc.) with which a genetic function is 
associated. A gene is a hereditary unit, for example of an organism, comprising a 
polynucleotide sequence (e.g., a DNA sequence for mammals) that occupies a specific 
physical location (a "gene locus" or "genetic locus") within the genome of an organism. A 
gene can encode an expressed product, such as a polypeptide or a polynucleotide (e.g., 
tRNA). Alternatively, a gene may define a genomic location for a particular event/function, 
such as the binding of proteins and/or nucleic acids (e.g., phage attachment sites), wherein 
the gene does not encode an expressed product. Typically, a gene includes coding 
sequences, such as polypeptide encoding sequences, and non-coding sequences, such as 
promoter sequences, polyadenylation sequences, and transcriptional regulatory sequences (e.g., 
enhancer sequences). Many eucaryotic genes have "exons" (coding sequences) interrupted 
by "introns" (non-coding sequences). In certain cases, a gene may share sequences with 
another gene(s) (e.g., overlapping genes). 

"Isogenic" means two or more organisms or cells that are considered to be 
genetically identical. "Substantially isogenic" means two or more organisms or cells 
wherein, at the majority of genetic loci (e.g., greater than 99.000%, preferably more than 
99.900%, more preferably greater than 99.990%, even more preferably greater than 
99.999%), there exists genetic identity between the organisms or cells being compared. In 
the context of the present invention, two organisms (for example, mice) are considered to be 
"substantially isogenic" if, for example, inserted transgenes are the primary differences 
between the genetic make-up of the mice being compared. Further, if, for example, the 
genetic backgrounds of the mice being compared are the same with the exception that one of 
the mice has one or several defined mutation(s) (for example, affecting coat color), then 
these mice are considered to be substantially isogenic. An example of two strains of 
substantially isogenic mice is C57BL/6 and C57BL/6-Tyr^c-2J/+. 
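
The percentage thresholds in this definition can be made concrete with a minimal sketch. It assumes each organism is summarized as a mapping from locus name to genotype, which is a simplification introduced here for illustration; the function names and the example loci are hypothetical.

```python
def fraction_identical_loci(a: dict, b: dict) -> float:
    """Fraction of shared loci at which two genotype maps agree."""
    shared = a.keys() & b.keys()
    if not shared:
        raise ValueError("no loci in common")
    return sum(a[locus] == b[locus] for locus in shared) / len(shared)


def substantially_isogenic(a: dict, b: dict, threshold: float = 0.99) -> bool:
    """True if identity exceeds the threshold (0.99 corresponds to 'greater than 99.000%')."""
    return fraction_identical_loci(a, b) > threshold


# Two made-up genotype maps differing at one of 1000 loci (e.g., a coat color locus).
background = {f"locus{i}": "wt" for i in range(1000)}
coat_mutant = dict(background, locus0="mutant")

print(fraction_identical_loci(background, coat_mutant))
print(substantially_isogenic(background, coat_mutant))
print(substantially_isogenic(background, coat_mutant, threshold=0.9999))
```

With a single difference among 1000 loci, the pair meets the 99.000% and 99.900% levels only if the comparison is extended to more loci; the stricter levels in the definition require correspondingly larger locus counts.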

A "pseudogene" as used herein, refers to a type of gene sequence found in the 
genomes, typically, of eucaryotes, where the sequence closely resembles a known functional 
gene, but differs in that the pseudogene is non-functional. For example, the pseudogene 
sequence may contain several stop codons in what would correspond to an open reading 
frame in the functional gene. Pseudogenes can also have deletions or insertions relative to 
their corresponding functional gene. If, for example, in a genome there is a functional gene 
and a related pseudogene, the functional gene is considered to be a single-copy gene 
(accordingly, the pseudogene is considered to be single-copy as well). 

A "non-essential gene" refers to a gene whose deletion, disruption, elimination, 
reduction of gene function, or mutation is non-lethal, and does not obviously adversely 
affect the organism's ability to mature and reproduce. A "non-essential gene with no 
phenotype" refers to a non-essential gene whose deletion, disruption, elimination, reduction 
of gene function or mutation has no deleterious effect on the organism. Typically there are 
no phenotypically reflected gene dosage effects associated with modification of a non- 
essential gene with no phenotype; for example, deletion, disruption or mutation of both 
copies of a non-essential gene with no phenotype in a diploid organism has essentially the 
same effect as deletion, disruption, or mutation of one of the two copies present in the 
diploid organism. In the context of the present invention, a non-essential gene is typically 
one whose function has been eliminated (e.g., by a deletion mutation) and such elimination 
of function was non-lethal and the organism developed, matured, and was able to reproduce. 

The "native sequence" or "wild-type sequence" of a gene is the polynucleotide 
sequence that comprises the genetic locus corresponding to the gene, e.g., all regulatory and 
open-reading frame coding sequences required for expression of a completely functional 
gene product as they are present in the wild-type genome of an organism. The native 
sequence of a gene can include, for example, transcriptional promoter sequences, translation 
enhancing sequences, introns, exons, and poly-A processing signal sites. It is noted that in 
the general population, wild-type genes may include multiple prevalent versions that contain 
alterations in sequence relative to each other and yet do not cause a discernible pathological 
effect. These variations are designated "polymorphisms" or "allelic variations." 

By "replacement sequence" is meant a polynucleotide sequence that is substituted for 
at least a portion of the native or wild-type sequence of a gene. 

"Linear vector" or "linearized vector," as used herein, is a vector having two ends. 
For example, circular vectors, such as plasmids, can be linearized by digestion with a 
restriction endonuclease that cuts at a single site in the plasmid. Preferably, the targeting 
vectors described herein are linearized such that the ends are not within the targeting 
sequences. 
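
The linearization criterion above (a single cut site, located outside the targeting sequences) can be checked with a short sketch. It assumes a circular plasmid given as a top-strand string, a palindromic recognition site (so that searching one strand is sufficient), and targeting regions given as half-open, non-wrapping index ranges. The sequences and function names are made up for illustration.

```python
def site_positions(plasmid: str, site: str) -> list:
    """0-based start positions of `site` on a circular plasmid (top strand only)."""
    n = len(plasmid)
    wrapped = plasmid + plasmid[: len(site) - 1]  # lets a site span the origin
    return [i for i in range(n) if wrapped.startswith(site, i)]


def linearizes_outside(plasmid: str, site: str, targeting_regions: list) -> bool:
    """True if `site` occurs exactly once and does not overlap any targeting region."""
    hits = site_positions(plasmid, site)
    if len(hits) != 1:
        return False  # zero sites: no cut; several sites: plasmid is fragmented
    n = len(plasmid)
    covered = {(hits[0] + k) % n for k in range(len(site))}
    return not any(lo <= p < hi for p in covered for lo, hi in targeting_regions)


plasmid = "AAAA" + "GAATTC" + "CCCCCCCC" + "GGGG"   # toy sequence, one GAATTC site
print(site_positions(plasmid, "GAATTC"))
print(linearizes_outside(plasmid, "GAATTC", [(10, 18)]))  # site lies outside the region
print(linearizes_outside(plasmid, "GAATTC", [(2, 8)]))    # site lies inside the region
```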

General Overview 

In one aspect, the present invention relates to vector constructs, cells containing the 
constructs, methods of screening compounds, and methods of creating transgenic animals to 
be used, for example, as screening or test systems. Methods of using the constructs, cells, 
and transgenic animals of the present invention include, but are not limited to, studies 
involving tumor growth and other disease conditions. Exemplary promoters useful in the 
practice of the present invention include mouse VEGFR-2 and mouse Tie2. In one 
embodiment, the present invention relates to novel promoters for the mouse VEGFR-2 
receptor gene, nucleic acid constructs comprising such promoters operatively linked to 
genes encoding a gene product, such as a reporter, a protein, polypeptide, hormone, 
ribozyme, or antisense RNA, recombinant cells comprising such nucleic acid constructs, 
screening for therapeutic drugs using such cells (e.g., screening for compounds that 
modulate VEGFR-2-mediated angiogenesis), and endothelial tissue-specific gene expression 
using these novel promoter sequences. 

In yet another aspect of the present invention, transgenic, non-human mammals are 
constructed where a single-copy, non-essential gene is replaced by a reporter expression 
cassette, preferably a gene encoding a light-generating protein, such as a luciferase-encoding 
gene, operably linked to a promoter. A variety of promoters are useful in the practice of the 
present invention, for example, promoters derived from genes associated with tumorigenesis 
or angiogenesis. Thus, an exemplary promoter can be one that is associated with proteins 
induced during tumorigenesis, for instance in the presence of tumor generating compounds 
or of tumors themselves. In this way, expression of the reporter cassette is induced in the 
animal when, for example, tumors are present, and progression of the tumor can be 
evaluated by non-invasive imaging methods using the whole animal. Another exemplary 
promoter is one that is derived from a gene associated with angiogenesis. Because the 
promoter is linked to a reporter such as luciferase, non-invasive monitoring of the 
progression of angiogenesis is possible. Various forms of the different embodiments of the 
invention, described herein, may be combined. 

Non-invasive imaging and/or detecting of light-emitting conjugates in mammalian 
subjects was described in U.S. Patent No. 5,650,135, by Contag, et al., issued 22 July 1997, 
and herein incorporated by reference. This imaging technology can be used in the practice 
of the present invention in view of the teachings of the present specification. In the imaging 
method, the conjugates contain a biocompatible entity and a light-generating moiety. 
Biocompatible entities include, but are not limited to, small molecules such as cyclic organic 
molecules; macromolecules such as proteins; microorganisms such as viruses, bacteria, 
yeast and fungi; eukaryotic cells; all types of pathogens and pathogenic substances; and 
particles such as beads and liposomes. In another aspect, biocompatible entities may be all 
or some of the cells that constitute the mammalian subject being imaged, for example, cells 
carrying the vector constructs of the present invention expressing a reporter expression 
cassette. 

Light-emitting capability is conferred on the biocompatible entities by the 
conjugation of a light-generating moiety. Such moieties include fluorescent molecules, 
fluorescent proteins, enzymatic reactions giving off photons and luminescent substances, 
such as bioluminescent proteins. In the context of the present invention, light emitting 
capability is typically conferred on target cells by having at least one copy of a light- 
generating protein, e.g., a luciferase, present. In preferred embodiments, luciferase is 
operably linked to appropriate control elements which can facilitate expression of a 
polypeptide having luciferase activity. Substrates of luciferase can be endogenous to the 
cell or applied to the cell or system (e.g., injection into a transgenic mouse, having cells 
carrying a luciferase construct, of a suitable substrate for the luciferase, for example, 
luciferin). The conjugation may involve a chemical coupling step, genetic engineering of a 
fusion protein, or the transformation of a cell, microorganism or animal to express a light- 
generating protein. 

Targeting Constructs 

The targeting cassettes described herein typically include the following components: 
(1) a suitable vector backbone; (2) a polynucleotide encoding a light generating protein; (3) 
a promoter operably linked to the luciferase-encoding gene, wherein the promoter is 
heterologous to the coding sequences of the light generating protein; (4) a sequence 
encoding a positive selection marker; (5) insertion sites flanking the sequence encoding the 
positive selection marker and the polynucleotide encoding a light generating protein, 
for insertion of sequences which target a single-copy, non-essential chromosomal gene; and, 
optionally, (6) a sequence encoding a negative selection marker. Exemplary targeting 
constructs are shown in Figures 3B, 13 and 18 and described in Examples 1-3. 
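
The component list above can be represented as a small record with a consistency check. This is a sketch for illustration only: the class name, field names, and the example values (including the restriction site names) are hypothetical and are not taken from the Figures or Examples.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class TargetingConstruct:
    backbone: str                          # (1) vector backbone
    reporter_gene: str                     # (2) light generating protein coding sequence
    promoter_gene: str                     # (3) gene from which the promoter is derived
    positive_marker: str                   # (4) positive selection marker
    insertion_sites: Tuple[str, ...]       # (5) flanking sites for targeting sequences
    negative_marker: Optional[str] = None  # (6) optional negative selection marker

    def problems(self) -> List[str]:
        """Return a list of ways this record departs from the component list."""
        found = []
        for name in ("backbone", "reporter_gene", "promoter_gene", "positive_marker"):
            if not getattr(self, name):
                found.append(f"missing {name}")
        # Component (3): the promoter must be heterologous to the reporter.
        if self.promoter_gene and self.promoter_gene == self.reporter_gene:
            found.append("promoter is not heterologous to the reporter")
        # Component (5): this sketch assumes one site on either side of the cassette.
        if len(self.insertion_sites) != 2:
            found.append("expected two flanking insertion sites")
        return found


example = TargetingConstruct("pBluescriptKS", "luc", "VEGFR-2", "neo",
                             ("NotI", "SalI"), "HSV-tk")
print(example.problems())
```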




Suitable vector backbones generally include an F1 origin of replication; a colE1 
plasmid-derived origin of replication; polyadenylation sequence(s); sequences encoding 
antibiotic resistance (e.g., ampicillin resistance) and other regulatory or control elements. 
Non-limiting examples of appropriate backbones include: pBluescriptSK (Stratagene, La 
Jolla, CA); pBluescriptKS (Stratagene, La Jolla, CA) and other commercially available 
vectors. 

In one aspect of the invention, the light generating protein is luciferase. Luciferase 
coding sequences useful in the practice of the present invention include sequences obtained 
from lux genes (procaryotic genes encoding a luciferase activity) and luc genes (eucaryotic 
genes encoding a luciferase activity). A variety of luciferase encoding genes have been 
identified including, but not limited to, the following: B.A. Sherf and K.V. Wood, U.S. 
Patent No. 5,670,356, issued 23 September 1997; Kazami, J., et al., U.S. Patent No. 
5,604,123, issued 18 February 1997; S. Zenno, et al., U.S. Patent No. 5,618,722; K.V. Wood, 
U.S. Patent No. 5,650,289, issued 22 July 1997; K.V. Wood, U.S. Patent No. 5,641,641, 
issued 24 June 1997; N. Kajiyama and E. Nakano, U.S. Patent No. 5,229,285, issued 20 July 
1993; M.J. Cormier and W.W. Lorenz, U.S. Patent No. 5,292,658, issued 8 March 1994; 
M.J. Cormier and W.W. Lorenz, U.S. Patent No. 5,418,155, issued 23 May 1995; de Wet, 
J.R., et al., Molec. Cell Biol. 7:725-737, 1987; Tatsumi, H.N., et al., Biochim. Biophys. Acta 
1131:161-165, 1992; and Wood, K.V., et al., Science 244:700-702, 1989; all herein 
incorporated by reference. Eukaryotic luciferase catalyzes a reaction using luciferin as a 
luminescent substrate to produce light, whereas prokaryotic luciferase catalyzes a reaction 
using an aldehyde as a luminescent substrate to produce light. 

Wild-type firefly luciferases typically have an emission maximum at about 550 nm. 
Numerous variants with differing emission maxima have also been studied. For example, 
Kajiyama and Nakano (Protein Eng. 4(6):691-693, 1991; U.S. Patent No. 5,330,906, issued 
19 July 1994, herein incorporated by reference) teach five variant firefly luciferases 
generated by single amino acid changes to the Luciola cruciata luciferase coding sequence. 
The variants have emission peaks of 558 nm, 595 nm, 607 nm, 609 nm and 612 nm. A 
yellow-green luciferase with an emission peak of about 540 nm is commercially available 
from Promega, Madison, WI under the name pGL3. A red luciferase with an emission peak 
of about 610 nm is described, for example, in Contag et al. (1998) Nat. Med. 4:245-247 and 
Kajiyama et al. (1991) Prot. Eng. 4:691-693. 

Positive selection markers include any gene which encodes a product that can be readily 
assayed. Examples include, but are not limited to, a hprt gene (Littlefield, J. W., Science 
145:709-710 (1964), herein incorporated by reference), a xanthine-guanine 
phosphoribosyltransferase (gpt) gene, or an adenine phosphoribosyltransferase (aprt) gene 
(Sambrook et al., supra), a thymidine kinase gene (i.e., "TK") and especially the TK gene of 
herpes simplex virus (Giphart-Gassler, M. et al., Mutat. Res. 214:223-232 (1989) herein 
incorporated by reference), a nptII gene (Thomas, K. R. et al., Cell 51:503-512 (1987); 
Mansour, S. L. et al., Nature 336:348-352 (1988), both references herein incorporated by 
reference), or other genes which confer resistance to amino acid or nucleoside analogues, or 
antibiotics, etc., for example, gene sequences which encode enzymes such as dihydrofolate 
reductase (DHFR) enzyme, adenosine deaminase (ADA), asparagine synthetase (AS), 
hygromycin B phosphotransferase, or a CAD enzyme (carbamyl phosphate synthetase, 
aspartate transcarbamylase, and dihydroorotase). Addition of the appropriate substrate of 
the positive selection marker can be used to determine if the product of the positive selection 
marker is expressed; for example, cells which do not express the positive selection marker 
nptII are killed when exposed to the substrate G418 (Gibco BRL Life Technology, 
Gaithersburg, MD). 

The targeting vector typically contains insertion sites for inserting targeting 
sequences (e.g., sequences that are substantially homologous to the target sequences in the 
host genome where integration of the targeting vector/expression cassette is desired). These 
insertion sites are preferably included such that there are two sites, one site on either side of 
the sequences encoding the positive selection marker, luciferase and the promoter. Insertion 
sites are, for example, restriction endonuclease recognition sites, and can, for example, 
represent unique restriction sites. In this way, the vector can be digested with the 
appropriate enzymes and the targeting sequences ligated into the vector. 

Optionally, the targeting construct can contain a polynucleotide encoding a negative 
selection marker. Suitable negative selection markers include, but are not limited to, HSV- 
tk (see, e.g., Majzoub et al. (1996) New Engl. J. Med. 334:904-907 and U.S. Patent No. 
5,464,764), as well as genes encoding various toxins including the diphtheria toxin, the 
tetanus toxin, the cholera toxin and the pertussis toxin. A further negative selection marker 
gene is the hypoxanthine-guanine phosphoribosyl transferase (HPRT) gene for negative 
selection in 6-thioguanine. 

Exemplary promoters and single-copy, non-essential genes for use in the vector 
constructs and methods of the present invention are described below. 

Promoters 

The targeting constructs and transgenic animals described herein contain a sequence 
encoding a luciferase gene operably linked to a promoter. The promoter may be from the 
same species as the transgenic animal (e.g., mouse promoter used in construct to make 
transgenic mouse) or from a different species (e.g., human promoter used in construct to 
make transgenic mouse). The promoter can be derived from any gene of interest. In one 
embodiment of the present invention, the promoter is derived from a gene whose expression 
is induced during angiogenesis, for example pathogenic angiogenesis like tumor 
development. Thus, when a tumor begins to develop in a transgenic animal carrying a 
vector construct of the present invention, the promoter is induced and the animal expresses 
luciferase, which can then be monitored in vivo. 

Exemplary promoters for use in the present invention are selected such that they are 
functional in a cell type and/or animal into which they are being introduced. Exemplary 
promoters include, but are not limited to, promoters obtained from the following mouse 
genes: vascular endothelial growth factor (VEGF) (VEGF promoter described in U.S. 
Patent No. 5,916,763; Shima et al. (1996) J. Biol. Chem. 271:3877-3883; sequence available 
on NCBI under accession number U41383); VEGFR2, also known as Flk-1, (VEGFR-2 
promoter described, for example, in Ronicke et al. (1996) Circ. Res. 79:277-285; Patterson 
et al. (1995) J. Biol. Chem. 270:23111-23118; Kappel et al. (1999) Blood 93:4282-4292; 
sequence available as accession number X89777 of NCBI database); Tie2, also known as 
Tek (Tie2 promoter described, for example, in Fadel et al. (1998) Biochem. J. 338:335-343; 
Schlaeger et al. (1995) Develop. 121:1089-1098; Schlaeger et al. (1997) PNAS USA 94:3058- 
3063). VEGF is a specific mitogen for EC in vitro and a potent angiogenic factor in vivo. In 
a tumorigenesis study, it was shown that VEGF was critical for the initial subcutaneous 
growth of T-47D breast carcinoma cells transplanted into nude mice, whereas other 
angiogenic factors, such as bFGF, can compensate for the loss of VEGF after the tumors 
have reached a certain size (Yoshiji, H., et al., 1997 Cancer Research 57: 3924-28). VEGF is 
a major mediator of aberrant EC proliferation and vascular permeability in a variety of 
human pathologic situations, such as tumor angiogenesis, diabetic retinopathy and 
rheumatoid arthritis (Benjamin LE, et al., 1997 PNAS 94: 8761-66; Soker, S., et al., 1998 
Cell 92: 735-745). VEGF is synthesized by tumor cells in vivo and accumulates in nearby 
blood vessels. Because leaky tumor vessels initiate a cascade of events, which include 
plasma extravasation and which lead ultimately to angiogenesis and tumor stroma 
formation, VEGF plays a pivotal role in promoting tumor growth (Dvorak, H.F., et al., 1991 
J Exp Med 174:1275-8). VEGF expression was upregulated by hypoxia (Shweiki, D., et al., 
1992 Nature 359: 843-5). VEGF is also upregulated by overexpression of v-Src oncogene 
(Mukhopadhyay, D., et al., 1995 Cancer Res. 55: 6161-5), c-SRC (Mukhopadhyay, D., et al., 
1995 Nature 375: 577-81), and mutant ras oncogene (Plate, K.H., et al., 1992 Nature 359: 
845-8). The tumor suppressor p53 downregulates VEGF expression (Mukhopadhyay, D., et 
al., 1995 Cancer Res. 55: 6161-5). 

A number of cytokines and growth factors, including PGF and TPA (Grugel, S., et 
al., 1995 J. Biological Chem. 270: 25915-9), EGF, TGF-b, IL-1, and IL-6, induce VEGF mRNA 
expression in certain types of cells (Ferrara, N., et al., 1997 Endocr. Rev. 18: 4-25). The Kaposi's 
sarcoma-associated herpesvirus (KSHV)-encoded G-protein-coupled receptor, a homolog 
of the IL-8 receptor, can activate JNK/SAPK and p38MAPK and increase VEGF production, 
thus causing cell transformation and tumorigenicity (Bais, C., et al., Nature 1998 391:86-9). 
VEGF overexpression in skin of transgenic mice induces angiogenesis, 
vascular hyperpermeability and accelerated tumor development (Larcher, F., et al., Oncogene 
1998 17:303-11). 

VEGF-B (cDNA sequences available on databases) is a mitogen for EC and may be 
involved in angiogenesis in muscle and heart (Olofsson, B., et al., 1996 Proc Natl Acad Sci 
USA 93:2576-81). Shown in vitro, binding of VEGF-B to its receptor VEGFR-1 leads to 
increased expression and activity of urokinase type plasminogen activator and plasminogen 
activator inhibitor, suggesting a role for VEGF-B in the regulation of extracellular matrix 
degradation, cell adhesion, and migration (Olofsson, B., et al., 1998 Proc Natl Acad Sci U S 
A 95:11709-14). 

VEGF-C (see, e.g., U.S. Patent No. 5,916,763 and Shima et al., supra) may regulate 
angiogenesis of lymphatic vasculature, as suggested by the pattern of VEGF-C expression in 
mouse embryos (Kukk, E., et al., 1996 Development 122: 3829-37). Although VEGF-C is 
also a ligand for VEGFR-2, the functional significance of this potential interaction is 
unknown. Overexpression of VEGF-C in the skin of transgenic mice resulted in lymphatic, 
but not vascular, endothelial proliferation and vessel enlargement, suggesting the major 
function of VEGF-C is through VEGFR-3 rather than VEGFR-2 (Jeltsch M, et al., 1997 
Science 276:1423-5). As shown by the CAM assay, VEGF and VEGF-C are specific 
angiogenic and lymphangiogenic growth factors, respectively (Oh, S.J., et al., (1997) Devel 
Biol 188: 96-109). 

VEGF-D (cDNA sequences available on databases) is a mitogen for EC. Given that 
VEGF-D can also activate VEGFR-3, it is possible that VEGF-D could be involved in the 
regulation of growth and/or differentiation of lymphatic endothelium (Achen, M.G., et al., 
1998 Proc Natl Acad Sci U S A 95: 548-53). VEGF-D is induced by transcription factor c- 
Fos in mouse (Orlandini, M., 1996 PNAS 93: 11675-80). 

VEGFR-1 signaling pathway may regulate normal endothelial cell-cell or cell matrix 
interactions during vascular development, as suggested by the knockout study (Fong, G.H., 
et al., 1995 Nature 376: 65-69). Although VEGFR-1 has a higher affinity for VEGF than 
VEGFR-2, it does not transduce the mitogenic signals of VEGF in ECs (Soker, S., et 
al., 1998 Cell 92: 735-745). VEGFR-2 (see, e.g., Ronicke et al., Patterson et al., Kappel et 
al. (1999), supra) appears to be the major transducer of VEGF signals in EC that result in 
chemotaxis, mitogenicity and gross morphological changes in target cells (Soker, S., et 
al., 1998 Cell 92: 735-745). The cloning and sequencing of the 4.5 kb VEGFR2 promoter 
region is described herein (Example 3). 

VEGFR-3 has an essential role in the development of the embryonic cardiovascular 
system before the emergence of lymphatic vessels, as shown by the knockout study 
(Dumont, D.J., et al., 1998 Science 282: 946-949). Neuropilin-1 (see, e.g., Soker et al. 
(1998) Cell 92:735-745) is a receptor for VEGF165. It can enhance the binding of VEGF165 
to VEGFR-2 and VEGF165-mediated chemotaxis (Soker, S., et al., 1998 Cell 92: 735-745). 
Neuropilin-1 overexpression in transgenic mice resulted in embryonic lethality. The 
embryos possessed excess capillaries and blood vessels. Dilated vessels and hemorrhage 
were also observed (Kitsukawa, T., et al., 1995 Development 121: 4309-18). 

Further promoters of interest include, but are not limited to, the following. Ang2 is 
expressed only at predominant vascular remodeling sites, such as ovary, placenta, and uterus 
(Maisonpierre, P.C., et al., 1997 Science 277: 55-60). In glioblastoma angiogenesis, Ang2 is 
found to be expressed in endothelial cells of small blood vessels and capillaries while Ang1 is 
expressed in glioblastoma tumor cells (Stratmann, A., 1998 Am J Pathol 153: 1459-66). 
Ang2 is up-regulated in bovine microvascular endothelial cells by VEGF, bFGF, cytokines, and 
hypoxia (Mandriota, S.J., 1998 Circ Res 83: 852-9). Ang2 transgenic overexpression 
disrupts angiogenesis, and is embryonic lethal (Maisonpierre, P.C., et al., 1997 Science 277: 
55-60). Ang1 is widely expressed, but is less abundant in heart and liver (Maisonpierre, P.C., et 
al., 1997 Science 277: 55-60). Ang1 is expressed in mesenchymal cells and may up-regulate 
the expression of Tie2 in the endothelial cells (Suri, C., et al., 1996 Cell 87: 1171-1180). 
Ang1 overexpression in the skin of transgenic mice produces larger, more numerous, and 
more highly branched vessels (Suri, C., et al., Science 1998 282:468-71). 

Tie2 (see, e.g., Fadel et al.; Schlaeger et al. (1995), and Schlaeger et al. (1997), supra) 
is endothelial cell specific and is up-regulated during wound healing, follicle maturation (Puri, 
M.C., et al., 1995 EMBO J 14: 5884-91) and pathologic angiogenesis (Kaipainen, A., 1994 
Cancer Research 54: 6571-77), such as glioblastoma (Stratmann, A., 1998 Am J Pathol 153: 
1459-66). Tie2 is also expressed in non-proliferating adult endothelium and endothelial cell 
lines (Dumont, D.J., et al. (1994) Genes & Develop. 8:1897-1909). A Tie2 activating 
mutation causes vascular dysmorphogenesis (Vikkula M, et al., 1996 Cell 87: 1181-1190). 
Tie2 mutant overexpression in transgenic mice is embryonic lethal (Dumont, D.J., et al., 
supra). The cloning and sequencing of the 7.1 kb promoter region of Tie2 is described 
herein (Example 3). 

Other promoters useful in the practice of the present invention include, by way of 
example, promoters derived from the sequences encoding the following polypeptide 
products: PTEN (dual specificity phosphatase); BAI (brain-specific angiogenesis inhibitor); 
KAI1 (KANGAI 1); catenin beta-1 (cadherin-associated protein, beta); COX2 (PTGS2 
cyclooxygenase 2, a.k.a. prostaglandin-endoperoxide synthase 2); MMP2 (72 kDa type IV-A 
collagenase); MMP9 (92 kDa type IV-B collagenase); TIMP2 (tissue inhibitor of 
metalloproteinase 2); and TIMP3 (tissue inhibitor of metalloproteinase 3). 

PTEN is a tumor suppressor gene and encodes a protein of 403 amino acids (Li et 
al. (1997) Science 275:1943-1946; DiCristofano et al. (1998) Nature Genet. 19:348-355). 
Overexpression of PTEN has been shown to inhibit cell migration, and it is postulated that 
this protein may function as a tumor suppressor by negatively regulating cell interactions 
with the extracellular matrix or by negatively regulating the PI3K/PKB/Akt signaling 
pathway (Tamura et al. (1998) Science 280:1614-1617; Stambolic et al. (1998) Cell 
95:29-39). Mutations in PTEN have been detected in cancer cell lines and in the germline of 
patients having Cowden disease, Lhermitte-Duclos disease and Bannayan-Zonana syndrome 
(diseases and syndromes which are characterized by hyperplastic/dysplastic changes in the 
prostate, skin and colon and which are associated with an increased risk of certain cancers, 
for example, breast cancer, prostate cancer and colon cancer) (Marsh et al. (1998) Hum. 
Molec. Genet. 7:507-515; Marsh et al. (1998) J. Med. Genet. 35:881-885; Nelen et al. (1997) 
Hum. Molec. Genet. 6:1383-1387). 

BAI1 protein is predicted to be 1,584 amino acids in length and includes an 
extracellular domain, an intracellular domain and a 7-span transmembrane region similar to 
that of the secretin receptor (Nishimori et al. (1997) Oncogene 15:2145-2150). The 
extracellular region of BAI1 has a single Arg-Gly-Asp (RGD) motif recognized by integrins 
and also has five sequences corresponding to the thrombospondin type I (accession number 
188060) repeats that can inhibit angiogenesis induced by basic fibroblast growth factor 
(bFGF, accession number 134920). Shiratsuchi et al. (1997) Cytogenet. Cell Genet. 
79:103-108, cloned two other brain-specific angiogenesis inhibitor genes, designated BAI2 
(accession number 602683) and BAI3 (accession number 602684). Thus, it is postulated 
that members of this gene family may play a role in suppression of glioblastoma. 

KAI1 encodes a 267 amino acid protein which is a member of the leukocyte surface 
glycoprotein family. The protein has 4 hydrophobic transmembrane domains and 1 large 
extracellular hydrophilic domain with three potential N-glycosylation sites (Dong et al. 
(1995) Science 268:884-886). Molecular analysis of KAI1 is described, for example, in 
Dong et al. (1997) Genomics 41:25-32. KAI1 is a tumor metastasis suppressor gene that is 
capable of inhibiting the metastatic process in experimental animals. Expression of KAI1 is 
downregulated during tumor progression of prostate, breast, lung, bladder and pancreatic 
cancers in humans, apparently at the transcriptional or posttranscriptional level. Mashimo 
et al. (1998) PNAS USA 95:11307-11311, found that the tumor suppressor gene p53 can 
directly activate the KAI1 gene by interacting with the region 5' to the coding sequence, 
suggesting a direct relationship between p53 and KAI1. 

Catenin beta-1 is an adherens junction (AJ) protein. AJs are critical for 
establishing and maintaining epithelial cell layers, for instance during embryogenesis, 
wound healing and tumor cell metastasis. Molecular analysis, including description of 
sequence homology to plakoglobin (accession number 173325), homology to the Drosophila 
gene "armadillo" and interactions with Lef1/Tcf DNA binding proteins, is described, for 
example, in Nollet et al. (1996) Genomics 32:413-424; McCrea et al. (1991) Science 
254:1359-1361 and Korinek et al. (1997) Science 275:1784-1787. In addition, studies by 
Korinek et al., supra, and Morin et al. (1997) Science 275:1787-1790, have indicated that 
APC (accession number 175100) negatively regulates catenin beta and that regulation of this 
protein is critical to the tumor suppressive effect of APC. Abnormally high levels of 
beta-catenin have been detected in certain human melanoma cell lines (Rubinfeld et al. 
(1997) Science 275:1790-1792). Koch et al. (1999) Cancer Res. 59:269-273 report that 
childhood hepatoblastomas frequently carry a mutated degradation targeting box of the 
beta-catenin gene. Transgenic mice which express catenin beta under the control of an 
epidermal promoter undergo de novo hair morphogenesis, and eventually these animals 
develop two types of tumors: epithelioid cysts and trichofolliculomas (Gat et al. (1998) 
Cell 95:605-614). 

COX2 encodes a cyclooxygenase and is a key regulator of prostaglandin synthesis 
(Hla et al. (1992) PNAS USA 89:7384-7388; Jones et al. (1993) J. Biol. Chem. 
268:9049-9054). In particular, COX2 is generally considered to be a mediator of 
inflammation, and overexpression of COX2 in rat epithelial cells results in elevated levels 
of E-cadherin and Bcl2 (Tsujii & DuBois (1995) Cell 83:493-501). In co-cultures of 
endothelial cells and colon carcinoma cells, cells that overexpress COX2 produce 
prostaglandins and proangiogenic factors and stimulate both endothelial migration and tube 
formation (Tsujii et al. (1998) Cell 93:705-716). Experiments conducted using APC 
knock-out mice have demonstrated that animals homozygous for a disrupted COX2 locus 
develop significantly fewer adenomatous polyps (Oshima et al. (1996) Cell 87:803-809). 
COX-2 "knock out" mice develop severe nephropathy, are susceptible to peritonitis, exhibit 
reduced arachidonic acid-induced inflammation and exhibit reduced indomethacin-induced 
gastric ulceration (Morham et al. (1995) Cell 83:473-482; Langenbach et al. (1995) Cell 
83:483-492). Female mice that are deficient in cyclooxygenase 2 exhibit multiple 
reproductive failures (Lim et al. (1997) Cell 91:197-208). 

MMP2 is a metalloproteinase that specifically cleaves type IV collagen. A 
C-terminal fragment of MMP2, termed PEX, prevents normal binding to alpha-V/beta-3 and 
disrupts angiogenesis and tumor growth (Brooks et al. (1998) Cell 92:391-400). 

MMP9 is a collagenase secreted from normal skin fibroblasts. MMP9 null mice 
exhibit an abnormal pattern of skeletal growth plate vascularization and ossification (Vu et 
al. (1998) Cell 93:411-422). 

TIMP2 is a collagenase inhibitor and appears to play a major role in modulating the 
activity of interstitial collagenase and a number of connective tissue 
metalloendoproteases (Stetler-Stevenson et al. (1989) J. Biol. Chem. 264:17372-17378). 
Unlike TIMP1 and TIMP3, TIMP2 is not upregulated by TPA or TGF-beta (Hammani et al. (1996) 
J. Biol. Chem. 271:25498-25505). 

TIMP3 (Wilde et al. (1994) DNA Cell Biol. 13:711-718) is localized in the 
extracellular matrix in both its glycosylated and unglycosylated forms. Studies of mutant 
TIMP3 proteins have demonstrated that C-terminal truncations do not bind to the 
extracellular matrix (Langton et al. (1998) J. Biol. Chem. 273:16778-16781). 

As one of skill in the art will appreciate in view of the teachings of the present 
specification, promoter sequences can be derived and isolated from known polypeptide 
sequences or from cDNA or genomic sequences, using methods known in the art in view of 
the teachings herein. For example, the promoter sequences of VEGFR2 and Tie2 were 
isolated and sequenced as described in Example 3 below. Another exemplary method of 
isolating promoter sequences using cDNA is via a GenomeWalker® kit, commercially 
available from Clontech (Palo Alto, CA), and described on page 27 of the 1997-1998 
Clontech catalog. 

Targeting Sequences: Non-Essential Genes 

Central to the present invention is the fact that the targeting constructs contain 
"targeting" sequences (flanking, for example, the luciferase-encoding sequence and 
promoter) derived from a single-copy, non-essential gene. These targeting sequences in the 
construct act via homologous recombination to replace at least a portion of the non-essential 
gene in the genome with the light-generating protein-encoding (e.g., luciferase-encoding) 
sequence operably linked to a promoter. 

Non-limiting examples of targeting sequences for use in generating transgenic mice 
include sequences obtained from or derived from vitronectin, Fos B and galectin 3. A search 
of the Mouse Knockout & Mutation Database (Genome Systems, Inc., St. Louis, MO) can be 
used to identify genes that have been knocked out in mice where the generated knockout 
mice displayed no obvious defects. The chromosomal locus of any of these genes can be used 
to target promoter-luciferase transgenes in a manner similar to that described in Example 2. 
Single-copy, non-essential mouse genes identified in this manner include, but are not 
limited to, the following: Moesin (Msn), Doi Y., et al., J Biol Chem 1999, 274:2315-2321; 
Plasminogen activator inhibitor, type II (Planh2) and Planh1, Dougherty K.M., Proc Natl 
Acad Sci USA 1999, 96:686-691; Protein tyrosine phosphatase, receptor type, B (Ptprb), 
Elchebly et al. (1999) Science 283:1544-1548; Presenilin 1 (Psen1), Guo Q, et al. (1999) 
Proc Natl Acad Sci USA, 96:4125-4130; Protein kinase, mitogen-activated 9 (Prkm9) / 
SAPK/Erk/kinase 2 (Serk2), Kuan CY et al. (1999) Neuron 4:667-676; CD152 antigen (Cd152) / 
CD86 antigen (Cd86) / CD80 antigen (Cd80), Mandelbrot DA, et al. (1999) J Exp Med, 
189:435-440; Poly (ADP-ribose) polymerase (Adprp), Masutani M, et al. (1999), Proc Natl 
Acad Sci USA 96:2301-2304; Sodium channel, nonvoltage-gated 1 beta (Scnn1b), Pradervand S, 
et al. (1999) Proc Natl Acad Sci USA 96:1732-1737; Nuclear receptor coactivator 1 (Ncoa1), 
Qi C, et al. (1999) Proc Natl Acad Sci USA 96:1585-1590; Decay accelerating factor 1 
(Daf1), Sun X, et al. (1999) Proc Natl Acad Sci USA 96:628-633; Necdin (Ndn), Tsai TF, et 
al. (1999) Nat Genet 22:15-16; Relaxin (Rln), Zhao L, et al. (1999) Endocrinology 
140:445-453; Adenylyl cyclase 8 (Adcy8), Abdel-Majid RM, et al. (1998) Nat Genet 
19:289-291; Leukemia inhibitory factor (Lif), Bugga L, et al. (1998) J Neurobiol 
36:509-524; Lectin, galactose binding, soluble 3 (Lgals3) and Lgals1, Calnot C, et al. 
(1998) Dev Dyn 211:306-313; Urokinase plasminogen activator receptor (Plaur), Carmeliet P, 
et al. (1998) J Cell Biol 140:233-245; Nitric oxide synthase 1, neuronal (Nos1), Chao DS, 
et al. (1998) J Neurochem 71:784-789; Homeo box A7 (Hoxa7), Chen F, et al. (1998) Mech Dev 
77:49-57; Myosin light chain, phosphorylatable, cardiac ventricles (Mylpc), Chen J, et al. 
(1998) J Biol Chem 273:1252-1256; Homeo box B7 (Hoxb7), Chen F, et al. (1998) Mech 
Dev 77:49-57; Nuclear factor of kappa light chain gene enhancer in B-cells inhibitor, alpha 
and beta (Nfkbia and Nfkbib), Cheng JD, et al. (1998) J Exp Med 6:1055-1062; Enolase 1, 
alpha non-neuron (Eno1), Couldrey C, et al. (1998) Dev Dyn 212:284-292; Xeroderma 
pigmentosum, complementation group A (Xpa), De Vries A, et al. (1998) Exp Eye Res 
67:53-59; Von Willebrand factor homolog (Vwf), Denis C, et al. (1998) Proc Natl Acad Sci 
USA 95:9524-9529; Lysosomal acid lipase 1 (Lip1), Du H, et al. (1998) Hum Mol Genet 
7:1347-1354; UNC-5 homolog (C. elegans) 3 (Unc5h3), Eisenman LM, et al. (1998) J 
Comp Neurol 394:106-117; Protein phosphatase 1, regulatory (inhibitor) subunit 1B 
(Ppp1r1b), Fienberg AA, et al. (1998) Science 281:838-842; Myelin-associated 
glycoprotein (Mag), Fujita N, et al. (1998) J Neurosci 18:1970-1978; Paraoxonase 1 (Pon1), 
Furlong CE, et al. (1998) Neurotoxicology 19:645-650; Brain derived neurotrophic factor 
(Bdnf), Garek RR, et al. (1998) Laryngoscope 108:671-678; Neurotrophin 3 (Ntf3), Garek 
RR, et al. (1998) Laryngoscope 108:671-678; Myoglobin (Mb), Garry DJ, et al. (1998) 
Nature 395:905-908; Opioid receptor, mu (Oprm), Gaveriaux-Ruff C, et al. (1998) Proc 
Natl Acad Sci USA 95:6326-6330; Neuropeptide Y (Npy), Hollopeter G, et al. (1998) Int J 
Obes Relat Metab Disord 22:506-512; Procollagen, type I, alpha 1 (Cola1), Hormuzdi SG, et 
al. (1998) Mol Cell Biol 18:3368-3375; Centromere autoantigen B (Cenpb), Hudson DF, et 
al. (1998) J Cell Biol 141:309-319; Oculocerebrorenal syndrome of Lowe (Ocrl), Janne PA, 
et al. (1998) J Clin Invest 101:2042-2053; Arachidonate 12-lipoxygenase (Alox12), Johnson 
EN, et al. (1998) Proc Natl Acad Sci USA 95:3100-3105; H19 fetal liver mRNA (H19), 
Jones BK, et al. (1998) Genes Dev 12:2200-2207; Hepatocyte nuclear factor 3 gamma 
(winged helix transcription factor) (Hnf3g), Kaestner KH, et al. (1998) Mol Cell Biol 
18:4245-4251; Bone morphogenetic protein 2 (Bmp2) / Bone morphogenetic protein 7 
(Bmp7) and Bmp5, Katagiri T, et al. (1998) Dev Genet 22:340-348; Intercellular adhesion 
molecule (Icam1), Ley K, et al. (1998) Circ Res 83:287-294; Glutamyl aminopeptidase 
(Enpep), Lin Q, et al. J Immunol 1998, 160:4681-4687; Prion protein (Prnp), Lipp HP, et al. 
Behav Brain Res 1998, 95:47-54; RAB3A, member RAS oncogene family (Rab3a), Lonart 
G, et al. Neuron 1998, 21:1141-1150; Potassium voltage gated channel, shaker related 
subfamily, member 4 (Kcna4), London B, et al., J Physiol (Lond) 1998, 509:171-182; 
Apurinic/apyrimidinic endonuclease (Apex), Ludwig DL, et al. Mutat Res 1998, 409:17-29; 
T-cell receptor gamma, variable 5 (Tcrg-V5), Mallick-Wood CA, et al. Science 1998, 
279:1729-1733; Nuclear factor, erythroid derived 2, like 2 (Nfe2l2), Martin F, et al. Blood 
1998, 91:3459-3466; Interleukin 13 (Il13), McKenzie GJ, et al. Curr Biol 1998, 8:339-340; 
Sorbitol dehydrogenase 1 (Sdh1), Ng T, et al. Diabetes 1998, 47:961-966; Guanine 
nucleotide binding protein, alpha 11 (Gna11), Offermanns S, et al. EMBO J 1998, 
17:4304-4312; Estrogen receptor alpha (Estra), Ogawa S, et al. Endocrinology 1998, 
139:5070-5081; Integrin beta 2 (Itgb2), Intracellular adhesion molecule (Icam1) and CD34 
antigen, Oliveira-dos-Santos AJ, et al. Eur J Immunol 1998, 28:2882-2892; Angiotensin 
receptor 1b (Agtr1b), Oliverio MI, et al. Proc Natl Acad Sci USA 1998, 95:15496-15501; 
Complement factor B (factor B), Pekna M, et al. Scand J Immunol 1998, 47:375-380; 
Centromere autoantigen B (Cenpb), Perez-Castro AV, et al. Dev Biol 1998, 201:135-143; 
Procollagen, type V, alpha 2 (Col5a2) / Fibrillin 1 (Fbn1), Phelps RG, et al. Mol Med 1998, 
4:356-360; Plasminogen activator inhibitor, type 1 (Planh1), Pinsky DJ, et al. J Clin 
Invest 1998, 102:919-928, Carmeliet P, et al. J Clin Invest 1993, 92:2756-2760; Placentae 
and embryos oncofetal gene (Pem), Pitman JL, et al. Dev Biol 1998, 202:196-214; 
Postmeiotic segregation increased 1 (S. cerevisiae) (Pms1), Prolla TA, et al. Nat Genet 
1998, 18:276-278; Prion protein, structural locus (Prn-p), Prusiner SB, et al. Proc Natl 
Acad Sci USA 1998, 90:10608-10612, Lledo P-M, et al. Proc Natl Acad Sci USA 1996, 
93:2403-2407, Sailer A, et al. Cell 1994, 77:967-968, Weissmann C, et al. Philos Trans R 
Soc Lond [Biol] 1994, 343:431-433, Bueler H, et al. Cell 1993, 73:1337-1347, Weissmann C, 
et al. Intervirology 1993, 35:164-175; NAD(P)H:quinone oxidoreductase, Radjendirane V, et 
al. J Biol Chem 1998, 273:7382-7389; Alpha tropomyosin (Tpm1), Rethinasamy P, et al. Circ 
Res 1998, 82:116-123; Goosecoid and Goosecoid-like (Gscl), Wakamiya M, et al. Hum Mol 
Genet 1998, 7:1835-1840, Saint-Pore B, et al. Hum Mol Genet 1998, 7:1841-1849; Schlafen 1 
(Slfn1), Schwarz DA, et al. Immunity 1998, 9:657-668; Nuclear factor, erythroid derived 2, 
ubiquitous (Mafk), Shavit JA, et al. Genes Dev 1998, 12:2164-2174; 
Microphthalmia-associated transcription factor (Mitf), Smith SB, et al. Exp Eye Res 1998, 
66:403-410; Bone morphogenetic protein 6 (Bmp6), Solloway MJ, et al. Dev Genet 1998, 
22:321-339; 
Phosphatidylinositol glycan, class A (Piga), Takahama Y, et al. Eur J Immunol 1998, 
28:2159-2166; Paired-related homeobox 2 (Prx2), Berge D, et al. Development 1998, 
125:3831-3842; Prostaglandin E receptor EP1 subtype (Ptgerep1), Ushikubi F, et al. Nature 
1998, 395:281-284; Immunoglobulin kappa chain complex (Igk), van der Stoep N, et al. 
Immunity 1998, 8:743-750; Adenine phosphoribosyl transferase (Aprt), Van Sloun PP, et al. 
Nucleic Acids Res 1998, 26:4888-4894; Microtubule associated protein 4 (Mtap4), Voss 
AK, et al. Dev Dyn 1998, 212:258-266; 3-hydroxy-3-methylglutaryl-coenzyme A lyase 
(Hmgcl), Wang SP, et al. Hum Mol Genet 1998, 7:2057-2062; Fibroblast growth factor 
receptor 4 (Fgfr4), Weinstein M, et al. Development 1998, 125:3615-3623; Hepsin (Hpn), 
Wu Q, et al. J Clin Invest 1998, 101:321-326; Small inducible cytokine A11 (Scya11), 
Yang T, et al. Blood 1998, 92:3912-3923; Small nuclear ribonucleoprotein N (Snrpn), Yang 
T, et al. Nat Genet 1998, 19:25-31; DNA fragmentation factor, alpha subunit (Dffa), Zhang 
J, et al. Proc Natl Acad Sci USA 1998, 95:12480-12485; Early growth response 1 (Egr1), 
Zheng D, et al. Neuroscience 1998, 83:251-258; Early growth response 1 (Egr1) / Hormone 
receptor (Hmr), Zheng D, et al. Neuroscience 1998, 83:251-258; Hemochromatosis 
(Hfe), Zhou XY, et al. Proc Natl Acad Sci USA 1998, 95:2492-2497; Alpha tropomyosin 
(Tpm1), Blanchard EM, et al. Circ Res 1997, 81:1005-1010; tRNA phosphoserine (Trsp), 
Bosl MR, et al. Proc Natl Acad Sci USA 1997, 94:5531-5534; Angiotensin receptor 1b 
(Agtr1b), Chen X, et al. Am J Physiol 1997, 272:F299-F304; Xeroderma pigmentosum, 
complementation group C (Xpc), Cheo DL, et al. Mut Res 1997, 374:1-9; B cell 
leukemia/lymphoma 6 (Bcl6), Dent AL, et al. Science 1997, 276:589-592; 
Fumarylacetoacetate hydrolase (Fah) / 4-hydroxyphenylpyruvic acid dioxygenase (Hpd), 
Endo F, et al. J Biol Chem 1997, 272:24426-24432; N-methylpurine-DNA glycosylase 
(Mpg), Engelward BP, et al. Proc Natl Acad Sci USA 1997, 94:13087-13092; Interleukin 1 
receptor, type 1 (Il1r1), Glaccum MB, et al. J Immunol 1997, 159:3364-3371; 
N-methylpurine-DNA glycosylase (Mpg), Hang B, et al. Proc Natl Acad Sci USA 1997, 
94:12869-12874; Gamma-aminobutyric acid (GABA-A) receptor, subunit alpha 6 (Gabra6), 
Homanics GE, et al. Mol Pharmacol 1997, 51:588-596; Superoxide dismutase 2, 
mitochondrial (Sod2), Huang TT, et al. Arch Biochem Biophys 1997, 344:424-432; 
Interleukin 11 receptor, alpha chain 1 (Il11ra1), Nandurjar HH, et al. Blood 1997, 
90:2148-2159; Alkaline phosphatase 5 (Akp5), Narisawa S, et al. Dev Dyn 1997, 208:432-446; 
GATA-binding protein 4 (Gata4), Narita N, et al. Development 1997, 124:3755-3764; 
Lymphocyte protein tyrosine kinase (Lck) / Fyn protooncogene (Fyn), Page ST, et al. Eur J 
Immunol 1997, 27:554-562; P glycoprotein 3 (Pgy3), Schinkel AH, et al. Proc Natl Acad 
Sci USA 1997, 94:4028-4033; P glycoprotein 1 (Pgy1) / P glycoprotein 3 (Pgy3), Schinkel 
AH, et al., Proc Natl Acad Sci USA 1997, 94:4028-4033; Creatine kinase, mitochondrial 1, 
ubiquitous (Ckmt1), Steeghs K, et al. J Neurosci Methods 1997, 71:29-41; T cell receptor 
gamma, variable 4 (Tcrg-V4), Sunaga S, et al. J Immunol 1997, 158:4223-4228; 
Ia-associated invariant chain (p31 form) (Ii), Takaesu NT, et al. J Immunol 1997, 
158:187-199; Solute carrier family 18 (vesicular monoamine), member 2 (Slc18a2), 
Takahashi N, et al. Proc Natl Acad Sci USA 1997, 94:9938-9943; Matrix metalloproteinase 7 
(Mmp7), Wilson CL, et al. Proc Natl Acad Sci USA 1997, 94:1402-1407; Formin (Fmn), 
Wynshaw-Boris A, et al. Mol Med 1997, 3:372-384; Synaptophysin (Syp), Arrandale JM, et al. 
J Biol Chem 1996, 271:21353-21358; Transformation related protein 53 (Trp53), Boehme SA, 
et al. J Immunol 1996, 156:4075-4078; Neuronal nitric oxide synthase (Nos1), Burnett AL, 
et al. Mol Medicine 1996, 2:288-296; Eph receptor A2 (Epha2), Chen J, et al. Oncogene 1996, 
12:979-988; Urokinase plasminogen activator receptor (Plaur), Dewerchin M, et al. J Clin 
Invest 1996, 97:870-878; Growth differentiation factor 9 (Gdf9), Dong J, et al. Nature 1996, 
383:531-535; Externally regulated phosphatase (Ptpn16), Dorfman K, et al. Oncogene 1996, 
13:925-931; Tenascin C (Tnc), Forsberg E, et al. Proc Natl Acad Sci USA 1996, 
93:6594-6599; Integrin alpha 1 (Itga1), Gardner H, et al. Dev Biol 1996, 175:301-313; FBJ 
osteosarcoma oncogene B (Fosb), Gruda MC, et al. Oncogene 1996, 12:2177-2185; Breast 
cancer 1 (Brca1), Hakem R, et al. Cell 1996, 85:1009-1023; Megakaryocyte-associated 
tyrosine kinase (Matk), Hamaguchi I, et al. Biochem Biophys Res Commun 1996, 
224:172-179; Apolipoprotein B editing complex 1 (Apobec1), Hirano K-I, et al. J Biol Chem 
1996, 271:9887-9890; Carboxyl ester lipase, Howies PN, et al. J. Biol. Chem. 1996, 
271:7196-7202; Nuclear factor, erythroid derived 2, ubiquitous (Nfe2u), Kotkow KJ, Orkin 
SH. Proc Natl Acad Sci USA 1996, 93:3514-3518; Retinoid X receptor gamma (Rxrg), Krezel W, et 
al. Proc Natl Acad Sci USA 1996, 93:9010-9014; Early growth response 1 (Egr1), Lee SL, 
et al. Mol Cell Biol 1996, 16:4566-4572; Adrenergic receptor, alpha 2b (Adra2b), Link RE, 
et al. Science 1996, 273:803-805; Angiotensin receptor 1a (Agtr1a), Matsusaka T, et al. J 
Clin Invest 1996, 98:1867-1877; Interleukin 3 receptor, beta chain 1 (Il3rb2), Nicola NA, 
et al. Blood 1996, 87:2665-2674; Transformation related protein 53 (Trp53), Ohashi M, et al. 
Jpn J Cancer Res 1996, 87:696-701; Leukemia-associated gene (Lag), Schubart UK, et al. J 
Biol Chem 1996, 271:14062-14066; Glutathione peroxidase 1 (Gpx1), Spector A, et al. Exp 
Eye Res 1996, 62:521-540; Arachidonate 12-lipoxygenase, leukocyte (Alox12l), Sun D, 
Funk CD. J Biol Chem 1996, 271:24055-24062; Complement component 3 (C3), Sylvestre 
D, et al. J Exp Med 1996, 184:2385-2392; CD30 antigen (Cd30), Texido G, et al. Eur J 
Immunol 1996, 26:1966-1969; Fc receptor, IgE, low affinity II, alpha polypeptide (Fcer2a), 
Immunoglobulin heavy chain 5 (delta-like heavy chain) (Igh-5), Interleukin 4 (IL4) and 
Terminal-deoxynucleotidyl transferase, Texido G, et al. Eur J Immunol 1996, 
26:1966-1969; Apolipoprotein A-II (Apoa2), Weng W, Breslow JL. Proc Natl Acad Sci USA 
1996, 93:14788-14794; Amyloid beta (A4) precursor protein (App), Zheng H, et al. Ann N Y 
Acad Sci 1996, 777:421-426; cAMP responsive element binding protein 1 (Creb1), Blendy 
JA, et al. Brain Res 1995, 681:8-14; Bradykinin receptor, beta 2 (Bdkrb2), Borkowski JA, et 
al. J Biol Chem 1995, 270:13706-13710; Growth factor response protein (Gfrp), Crawford 
PA, et al. Mol Cell Biol 1995, 15:4331-4336; Ciliary neurotrophic factor (Cntf), de Chiara 
TM, et al. Cell 1995, 83:313-322; Cyclin dependent kinase inhibitor 1A (P21) (Cdkn1a), 
Deng C, et al. Cell 1995, 82:675-684; Granzyme A (Gzma), Ebnet K, et al. EMBO J 1995, 
14:4230-4239; Very low density lipoprotein receptor (Vldlr), Frykman PK, et al. Proc Natl 
Acad Sci USA 1995, 92:8453-8457; Apolipoprotein E (Apoe) / Apolipoprotein A-I 
(Apoa1), Goodrum JF, et al. J Neurobiol 1995, 64:408-416; Nitric oxide synthase 1, 
neuronal (Nos1), Ichinose F, et al. Anesthesiology 1995, 83:101-108; Nitric oxide synthase 
2, inducible, macrophage (Nos2), Laubach VE, et al. Proc Natl Acad Sci USA 1995, 
92:10688-10692; Peroxisome proliferator activated receptor alpha (Ppara), Lee SS-T, et al. 
Mol Cell Biol 1995, 15:3012-3022; Growth factor response protein (Gfrp), Lee SL, et al. 
Science 1995, 269:532-535; H19 fetal liver mRNA (H19) / Insulin-like growth factor 2 
(Igf2), Leighton PA, et al. Nature 1995, 375:34-39; Retinoic acid receptor beta (RAR-beta), 
Luo J, et al. Mech Dev 1995, 53:61-71; Metallothionein 1 (Mt1) / Metallothionein 2 (Mt2), 
Philcox JC, et al. Biochem J 1995, 308:543-546; Heme oxygenase (decycling) 2 (Hmox2), 
Poss KD, et al. Neuron 1995, 15:867-873; H1-0 histone (H1fv), Sirotkin AM, et al. Proc 
Natl Acad Sci USA 1995, 92:6434-6438; Creatine kinase, mitochondrial 1, ubiquitous 
(Ckmt1), Steeghs K, et al. Biochim Biophys Acta 1995, 1230:130-138; Tenascin C (Tnc), 
Steindler DA, et al. J Neurosci 1995, 15:1971-1983; Ia-associated invariant chain (Ii), 
Takaesu NT, et al. Immunity 1995, 3:385-396; Neuroblastoma ras oncogene (Nras), 
Umanoff H, et al. Proc Natl Acad Sci USA 1995, 92:1709-1713; Receptor-associated 
protein of the synapse, 43 kDa (Rapsn), Willnow TE, et al. Proc Natl Acad Sci USA 1995, 
92:4537-4541; Vitronectin (Vtn), Zheng X, et al. Proc Natl Acad Sci USA 1995, 
92:12426-12430; Preproacrosin (Acr), Baba T, et al. J Biol Chem 1994, 269:31845-31849; 
Vimentin (Vim), Colucci GE, et al. Cell 1994, 79:679-694; Tumor necrosis factor receptor 
superfamily, member 1b (Tnfrsf1b), Erickson SL, et al. Nature 1994, 372:560-563; Cellular 
retinoic acid binding protein I (Crabp1), Gorry P, et al. Proc Natl Acad Sci USA 1994, 
91:9032-9036; cAMP responsive element binding protein 1 (Creb1), Hummler F, et al. Proc 
Natl Acad Sci USA 1994, 91:5647-5651; Pore forming protein (Pfp), Kagi D, et al. Nature 
1994, 369:31-37; Src-related kinase lacking C-terminal regulatory tyrosine and N-terminal 
myristylation site (Srms), Kohmura N, et al. Mol Cell Biol 1994, 14:6915-6925; CD3 
polypeptide zeta (Cd3z) / Solute carrier family 22, member 1 (Slc22a1), Koyasu S, et al. 
EMBO J 1994, 13:784-797; Pore forming protein (Pfp), Lowin B, et al. Proc Natl Acad Sci 
USA 1994, 91:11571-11575; Retinoic acid receptor, beta (Rarb), Retinoic acid receptor 
beta2 (RARbeta2), Mendelsohn C, et al. Dev Biol 1994, 166:246-258; Transthyretin (Ttr), 
Palha JA, et al. J Biol Chem 1994, 269:33135-33139; Procollagen, type X, alpha 1 
(Col10a1), Rosati R, et al. Nature Genet 1994, 8:129-135; P glycoprotein 3 (Pgy3), 
Schinkel AH, et al. Cell 1994, 77:491-502; Yamaguchi sarcoma viral (v-yes) oncogene 
homolog (Yes), Stein PL, et al. Genes Dev 1994, 8:1999-2007; Fc receptor, IgE, high 
affinity II, alpha polypeptide (Fcer2a), Stief A, et al. J Immunol 1994, 152:3378-3390; Pore 
forming protein (Pfp), Walsh CM, et al. Proc Natl Acad Sci USA 1994, 91:10854-10858; 
CD2 antigen (Cd2), Evans CF, et al. J Immunol 1993, 151:6259-6264; Mannose-6- 



PATENT 
PXE-012.US 




phosphate receptor, cation dependent (M6pr), Koster A, et al. EMBO J 1993, 12:5219-5223; 
Retinoic acid receptor, alpha (Rara), Li E, et al. Proc Natl Acad Sci USA 1993, 90:1590- 
1594, Lufkin T, et al. Proc Natl Acad Sci USA 1993, 90:7225-7229; Retinoic acid receptor, 
gamma (Rarg), Lohnes D, et al. Cell 1993, 73:643-658; Tumor necrosis factor receptor 1 
(TNF-R-1) (Tnfr1), Pfeffer K, et al. Cell 1993, 73:457-467; Lectin, galactose binding, 
soluble 1 (Lgals1), Poirier F, Robertson EJ. Development 1993, 119:1229-1236; Synapsin I 
(Syn1), Rosahl TW, et al. Cell 1993, 75:661-670; Tumor necrosis factor receptor 1 (Tnfr1), 
Rothe J, et al. Nature 1993, 364:798-802; Beta-2 microglobulin (B2m), Correa I, et al. Proc 
Natl Acad Sci USA 1992, 89:653-657; CD2 antigen (Cd2), Killeen N, et al. EMBO J 1992, 
11:4329-4336; Apolipoprotein E (Apoe), Piedrahita JA, et al. Proc Natl Acad Sci USA 
1992, 89:4471-4475; Myogenic differentiation 1 (Myod1), Rudnicki MA, et al. Cell 1992, 
71:383-390; Tenascin C (Tnc), Saga Y, et al. Genes Dev 1992, 6:1821-1831; Beta-2 
microglobulin (B2m), Sanjuan N, et al. J Virol 1992, 66:4587-4590; Neuroblastoma 
myc-related oncogene 1 (Nmyc1), Stanton BR, et al. Genes Dev 1992, 6:2235-2247; and 
Hemoglobin alpha chain complex (Hba), Popp RA, et al. Genetics 1983, 105:157-167. 

Some preferred single-copy, non-essential genes of the present invention, the knockouts 
of which show no phenotype, include, but are not limited to, the following: Moesin (Msn), 
Doi Y., et al., J Biol Chem 1999, 274:2315-2321; Plasminogen activator inhibitor, type II 
(Planh2) and Planh1, Dougherty K.M., Proc Natl Acad Sci USA 1999, 96:686-691; Nuclear 
receptor coactivator 1 (Ncoa1), Qi C, et al. (1999) Proc Natl Acad Sci USA 96:1585-1590; 
Nuclear factor of kappa light chain gene enhancer in B-cells inhibitor, alpha and beta 
(Nfkbia and Nfkbib), Cheng JD, et al. (1998) J Exp Med 6:1055-1062; H19 fetal liver mRNA 
(H19), Jones BK, et al. (1998) Genes Dev 12:2200-2207; Prion protein (Prnp), Lipp HP, et 
al. Behav Brain Res 1998, 95:47-54; Centromere autoantigen B (Cenpb), Perez-Castro AV, et 
al. Dev Biol 1998, 201:135-143; Placentae and embryos oncofetal gene (Pem), Pitman JL, et 
al. Dev Biol 1998, 202:196-214; Externally regulated phosphatase (Ptpn16), Dorfman K, et 
al. Oncogene 1996, 13:925-931; Transformation related protein 53 (Trp53), Ohashi M, et al. 
Jpn J Cancer Res 1996, 87:696-701; H1-0 histone (H1fv), Sirotkin AM, et al. Proc Natl Acad 
Sci USA 1995, 92:6434-6438; Creatine kinase, mitochondrial 1, ubiquitous (Ckmt1), Steeghs 
K, et 






al. Biochim Biophys Acta 1995, 1230:130-138; Neuroblastoma ras oncogene (Nras), 
Umanoff H, et al. Proc Natl Acad Sci USA 1995, 92:1709-1713; Vitronectin (Vtn), Zheng 
X, et al. Proc Natl Acad Sci U S A 1995, 92:12426-12430; Vimentin (Vim), Colucci GE, et 
al. Cell 1994, 79:679-694; Cellular retinoic acid binding protein I (Crabp1), Gorry P, et al. 
Proc Natl Acad Sci USA 1994, 91:9032-9036; Retinoic acid receptor beta2 (RARbeta2), 
Mendelsohn C, et al. Dev Biol 1994, 166:246-258; Retinoic acid receptor, alpha (Rara), Li 
E, et al. Proc Natl Acad Sci USA 1993, 90:1590-1594, Lufkin T, et al. Proc Natl Acad Sci 
USA 1993, 90:7225-7229; Lectin, galactose binding, soluble 1 (Lgals1), Poirier F, 
Robertson EJ. Development 1993, 119:1229-1236; Myogenic differentiation 1 (Myod1), 
Rudnicki MA, et al. Cell 1992, 71:383-390; and Tenascin C (Tnc), Saga Y, et al. Genes Dev 
1992, 6:1821-1831. 

In view of the guidance of the present specification, one of ordinary skill in the art 
can select similar, suitable, single-copy, non-essential genes in mice and other cell 
types/organisms. 
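
The replacement step described at the start of this section (targeting sequences acting via homologous recombination to swap part of a non-essential gene for the promoter-reporter sequence) can be illustrated with a toy string model. The sketch below is only an illustration under stated assumptions: the function name `homologous_replace`, the short example sequences, and the exactly-once check standing in for "single-copy" are all hypothetical and are not taken from the specification or its Examples.

```python
def homologous_replace(genome: str, left_arm: str, right_arm: str, payload: str) -> str:
    """Toy model: replace the genomic segment lying between two homology
    arms with a payload (e.g., promoter + reporter), keeping the arms."""
    # Assumption: "single-copy" is modeled as each arm occurring exactly once.
    for name, arm in (("left", left_arm), ("right", right_arm)):
        if genome.count(arm) != 1:
            raise ValueError(f"{name} homology arm must occur exactly once in the genome")

    left_end = genome.index(left_arm) + len(left_arm)
    right_start = genome.index(right_arm)
    if right_start < left_end:
        raise ValueError("right homology arm must lie downstream of the left arm")

    # Everything between the arms is swapped for the payload.
    return genome[:left_end] + payload + genome[right_start:]


# Hypothetical toy sequences (not real genomic data).
toy_genome = "AAAA" + "GATTACA" + "CCCCCC" + "TGCATGC" + "TTTT"
targeted = homologous_replace(toy_genome, "GATTACA", "TGCATGC", "GGGAAATTT")
print(targeted)
```

In this model the sequence between the arms ("CCCCCC") stands in for the replaced portion of the non-essential gene; real targeting depends on arm length, locus, and selection, none of which the sketch attempts to capture.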


Assembly of Targeting Cassettes 

The targeting cassettes described herein can be constructed utilizing methodologies 
known in the art of molecular biology (see, for example, Ausubel or Maniatis) in view of the 
teachings of the specification. As described above, the targeting constructs are assembled 
by inserting, into a suitable vector backbone, polynucleotides encoding a reporter, such as a 
light-generating protein, e.g., a luciferase gene, operably linked to a promoter of interest; a 
sequence encoding a positive selection marker; and, optionally, a sequence encoding a 
negative selection marker. In addition, the targeting cassette contains insertion sites such 
that sequences targeting a single-copy, non-essential gene can be readily inserted to flank 
the sequence encoding the positive selection marker and the luciferase-encoding sequence. 

A preferred method of obtaining polynucleotides and suitable regulatory sequences (e.g.,
promoters) is PCR. General procedures for PCR are taught in MacPherson et al., PCR: A
Practical Approach (IRL Press at Oxford University Press, 1991). PCR conditions for
each application reaction may be empirically determined. A number of parameters influence
the success of a reaction. Among these parameters are annealing temperature and time,
extension time, Mg2+ and ATP concentration, pH, and the relative concentration of primers,
templates and deoxyribonucleotides. Exemplary primers are described below in the
Examples. After amplification, the resulting fragments can be detected by agarose gel
electrophoresis followed by visualization with ethidium bromide staining and ultraviolet
illumination.
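
As a rough illustration of why the annealing temperature is usually tuned for each primer pair, the following Python sketch estimates primer melting temperature with the Wallace rule (2 degrees C per A or T, 4 degrees C per G or C), a common rule of thumb for short oligonucleotides. The primer sequences, the function names, and the 5 degree offset are illustrative assumptions; they are not the exemplary primers of the Examples, and any estimate of this kind is only a starting point for empirical optimization.

```python
def wallace_tm(primer: str) -> int:
    """Estimate Tm (deg C) of a short oligo with the Wallace rule: 2*(A+T) + 4*(G+C)."""
    seq = primer.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("primer must be a non-empty string of A, C, G, T")
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def starting_annealing_temp(forward: str, reverse: str, offset: int = 5) -> int:
    """Suggest a starting annealing temperature: the lower primer Tm minus an offset.

    The 5 degree default offset is an assumed convention, not a value from the text.
    """
    return min(wallace_tm(forward), wallace_tm(reverse)) - offset


if __name__ == "__main__":
    fwd = "ATGGAAGACGCCAAAAAC"  # hypothetical forward primer
    rev = "TTACACGGCGATCTTTCC"  # hypothetical reverse primer
    print(wallace_tm(fwd), wallace_tm(rev), starting_annealing_temp(fwd, rev))
```

The estimate ignores salt and primer concentration, so it only narrows the range over which annealing conditions are then tested.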

In one embodiment, PCR can be used to amplify fragments from genomic libraries.
Many genomic libraries are commercially available. Alternatively, libraries can be
produced by any method known in the art. Preferably, the organism(s) from which the DNA
is obtained has no discernible disease or phenotypic effects. This isolated DNA may be obtained
from any cell source or body fluid (e.g., ES cells, liver, kidney, blood cells, buccal cells,
cervicovaginal cells, epithelial cells from urine, fetal cells, or any cells present in tissue
obtained by biopsy, urine, blood, cerebrospinal fluid (CSF), and tissue exudates at the site of
infection or inflammation). DNA is extracted from the cells or body fluid using known
methods of cell lysis and DNA purification. The purified DNA is then introduced into a
suitable expression system, for example a lambda phage.

Another method for obtaining polynucleotides, for example, short, random
nucleotide sequences, is by enzymatic digestion. As described below in the Examples, short
DNA sequences can be generated by digestion of DNA from vectors carrying genes encoding
luciferase (yellow green or red).

Polynucleotides are inserted into vector genomes using methods known in the art.
For example, insert and vector DNA can be contacted, under suitable conditions, with a
restriction enzyme to create complementary or blunt ends on each molecule that can pair
with each other and be joined with a ligase. Alternatively, synthetic nucleic acid linkers can
be ligated to the termini of a polynucleotide. These synthetic linkers can contain nucleic
acid sequences that correspond to a particular restriction site in the vector DNA. Other
means are known and, in view of the teachings herein, can be used.

The final constructs can be used immediately (e.g., for introduction into ES cells), or
stored frozen (e.g., at -20°C) until use. Preferably, the constructs are linearized prior to use,
for example by digestion with suitable restriction endonucleases.

Transgenic Animals 

The targeting constructs containing the luciferase genes are introduced into a 
pluripotent cell (e.g., ES cell, Robertson, E. J., In: Current Communications in Molecular 
Biology, Capecchi, M. R. (ed.), Cold Spring Harbor Press, Cold Spring Harbor, N.Y. 
(1989), pp. 39-44). Suitable ES cells may be derived or isolated from any species or from 
any strain of a particular species. Although not required, the pluripotent cells are typically
derived from the same species as the intended recipient. ES cells may be obtained from
commercial sources, from International Depositories (e.g., the ATCC) or, alternatively, may
be obtained as described in Robertson, E. J., supra. Examples of clonally-derived ES cell
lines include 129/SVJ ES cells, RW-4 and C57BL/6 ES cells (Genome Systems, Inc.).

ES cells are cultured under suitable conditions, for example, as described in Ausubel
et al., section 9.16, supra. Preferably, ES cells are cultured on stromal cells (such as STO
cells (especially SNC4 STO cells) and/or primary embryonic fibroblast cells) as described
by E. J. Robertson, supra, pp 71-112. Culture media preferably includes leukemia
inhibitory factor ("lif") (Gough, N. M. et al., Reprod. Fertil. Dev. 1:281-288 (1989);
Yamamori, Y. et al., Science 246:1412-1416 (1989)), which appears to help keep the ES cells
from differentiating in culture. Stromal cells transformed with the gene encoding lif can also
be used.

The targeting constructs are introduced into the ES cells by any method which will 
permit the introduced molecule to undergo recombination at its regions of homology, for 
example, micro-injection, calcium phosphate transformation, or electroporation (Toneguzzo, 
F. et al., Nucleic Acids Res. 16:5515-5532 (1988); Quillet, A. et al., J. Immunol. 141:17-20 
(1988); Machy, P. et al., Proc. Natl. Acad. Sci. (U.S.A.) 85:8027-8031 (1988)). The
construct to be inserted into the ES cell must first be in the linear form. Thus, if the
knockout construct has been inserted into a vector as described above, linearization is
accomplished by digesting the DNA with a suitable restriction endonuclease selected to cut
only within the vector sequence and not within the knockout construct sequence. If the ES
cells are to be electroporated to insert the construct, the ES cells and construct DNA are
exposed to an electric pulse using an electroporation machine and following the
manufacturer's guidelines for use. After electroporation, the ES cells are typically allowed to
recover under suitable incubation conditions. The cells are then cultured under conventional
conditions, as are known in the art, and screened for the presence of the construct.
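
The choice of a linearizing endonuclease described above (one that cuts only within the vector sequence and not within the construct) can be sketched as a simple sequence search. The Python below is a minimal illustration under stated assumptions: the two sequences are short made-up strings, the function names are hypothetical, and only palindromic recognition sites are listed so that scanning one strand is sufficient. It does not check sites created at the vector/construct junctions or sites that span the origin of a circular plasmid.

```python
def count_sites(seq: str, site: str) -> int:
    """Count occurrences of a recognition site in seq, including overlapping ones."""
    seq, site = seq.upper(), site.upper()
    count, start = 0, seq.find(site)
    while start != -1:
        count += 1
        start = seq.find(site, start + 1)
    return count


def linearizing_enzymes(backbone: str, construct: str, sites: dict) -> list:
    """Return enzymes that cut the backbone exactly once and the construct never."""
    return sorted(
        name
        for name, site in sites.items()
        if count_sites(backbone, site) == 1 and count_sites(construct, site) == 0
    )


# Recognition sequences of three common enzymes (all palindromic).
SITES = {"EcoRI": "GAATTC", "BamHI": "GGATCC", "NotI": "GCGGCCGC"}

if __name__ == "__main__":
    backbone = "AAGCGGCCGCTTGGATCCAA"   # hypothetical vector sequence
    construct = "CCGGATCCTTGAATTCAA"    # hypothetical targeting construct
    print(linearizing_enzymes(backbone, construct, SITES))
```

An enzyme that cuts the backbone more than once is excluded here for simplicity, although such an enzyme could still release an intact construct.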

Screening and selection of those cells into which the targeting construct has been
integrated can be achieved using the positive selection marker and/or the negative selection
marker in the construct. In preferred embodiments, the construct contains both positive and
negative selection markers. In one aspect, methods which rely on expression of the
selection marker are used, for example, by adding the appropriate substrate to select only
those cells which express the product of the positive selection marker or to eliminate those
cells expressing the negative selection marker. For example, where the positive selection
marker encodes neomycin resistance, G418 is added to the transformed ES cell culture
media at increasing dosages. Similarly, where the negative selection marker is used, a
suitable substrate (e.g., ganciclovir if the negative selection marker encodes HSV-TK) is
added to the cell culture. Either before or after selection using the appropriate substrate, the
presence of the positive and/or negative selection markers in a recipient cell can also be
determined by other methods, for example, hybridization, detection of radiolabeled
nucleotides, PCR and the like. In preferred embodiments, cells having integrated targeting
constructs are first selected by adding the appropriate substrate for the positive and/or
negative selection markers. Cells that survive the selection process are then screened by
other methods, such as PCR or Southern blotting, for the presence of integrated sequences.
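
The selection logic described above can be summarized as a small predicate: a clone survives G418 only if it expresses the neomycin resistance marker, and survives ganciclovir only if it does not express HSV-TK. The Python sketch below is illustrative only; the `Clone` record and the clone names are hypothetical, and survival under selection does not replace the subsequent PCR or Southern blot screening.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clone:
    name: str
    expresses_neo: bool     # positive selection marker (neomycin resistance)
    expresses_hsv_tk: bool  # negative selection marker (HSV-TK)


def survives_selection(clone: Clone, g418: bool = True, ganciclovir: bool = True) -> bool:
    """Return True if the clone survives the drugs that are applied."""
    if g418 and not clone.expresses_neo:
        return False  # no resistance marker: killed by G418
    if ganciclovir and clone.expresses_hsv_tk:
        return False  # HSV-TK present: killed by ganciclovir
    return True


if __name__ == "__main__":
    clones = [
        Clone("untransfected", expresses_neo=False, expresses_hsv_tk=False),
        Clone("marker-retaining", expresses_neo=True, expresses_hsv_tk=True),
        Clone("candidate", expresses_neo=True, expresses_hsv_tk=False),
    ]
    print([c.name for c in clones if survives_selection(c)])
```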

After suitable ES cells containing the construct in the proper location have been
identified, the cells can be inserted into an embryo, preferably a blastocyst. The blastocysts
are obtained by perfusing the uterus of pregnant females. In one embodiment, the blastocysts
are obtained from, for example, the FVB/N strain of mice and the ES cells are obtained
from, for example, the C57BL/6 strain of mice. Suitable methods for accomplishing this are
known to the skilled artisan, and are set forth by, e.g., Bradley et al., (1992) Biotechnology,
10:534-539. Insertion into the embryo may be accomplished in a variety of ways known to
the skilled artisan; however, a preferred method is by microinjection. For microinjection,
about 10-30 ES cells are collected into a micropipet and injected into embryos that are at the
proper stage of development to permit integration of the foreign ES cell containing the
construct into the developing embryo. The suitable stage of development for the embryo
used for insertion of ES cells is species dependent; in mice it is about 3.5 days.

While any embryo of the right stage of development is suitable for use, it is preferred
that blastocysts are used. In addition, preferred blastocysts are male and, furthermore,
preferably have genes encoding a coat color that is different from that encoded by the genes
of the ES cells. In this way, the offspring can be screened easily for the presence of the knockout
construct by looking for mosaic coat color (indicating that the ES cell was incorporated into
the developing embryo). Thus, for example, if the ES cell line carries the genes for black
fur, the blastocyst selected will carry genes for white or brown fur.

After the ES cell has been introduced into the blastocyst, the blastocyst is typically
implanted into the uterus of a pseudopregnant foster mother for gestation. Pseudopregnant
females are prepared by mating with vasectomized males of the same species and successful
implantation usually must occur within about 2-3 days of mating.

Offspring are screened initially for mosaic coat color where the coat color selection
strategy has been employed. Southern blots and/or PCR may also be used to determine the
presence of the sequences of interest. Mosaic (chimeric) offspring are then bred to each
other to generate homozygous animals. Homozygotes and heterozygotes may be identified
by Southern blotting of equivalent amounts of genomic DNA from mice that are the product
of this cross, as well as mice that are known heterozygotes and wild type mice.
Alternatively, Northern blots can be used to probe the mRNA to identify the presence or
absence of transcripts encoding either the replaced gene, the luciferase gene, or both. In
addition, Western blots can be used to assess the level of expression of the luciferase protein
with an antibody against the luciferase gene product. Finally, in situ analysis (such as fixing
the cells and labeling with antibody) and/or FACS (fluorescence activated cell sorting)
analysis of various cells from the offspring can be conducted using suitable (e.g.,
anti-luciferase) antibodies to look for the presence or absence of the targeting construct.

In one embodiment of the present invention, the animals are from the C57BL/6
mouse strain. This strain develops a variety of tumors and has been used to develop a
number of tumor cell lines, for example, B16 melanoma cells (including B16F10, B16D5,
and B16F1), Lewis lung carcinoma cells (including LLC, LLC-h59), T241 mouse
fibrosarcoma cells, RM-1 and pTC2 mouse prostate cancer cells, and MCA207 mouse
sarcoma cells. These cell lines have been extensively used for in vivo tumor biology studies
after injection into C57BL/6 mice. The targeted transgenic mice generated in the Examples
are in a C57BL/6 genetic background and these animals are suitable for injection or
implantation of such tumor cells, as well as other tumor cells described in the literature that are
immunocompatible with C57BL/6 mice. Thus, the transgenic animals can then be used, for
example, to monitor, in vivo, tumor progression (e.g., growth) and the efficacy of therapies
on tumor regression. For example, where the transgenic animal is tumor-susceptible, it is
monitored for expression of a reporter, e.g., luciferase, which is indicative of tumorigenesis
and/or angiogenesis. The monitoring of expression of luciferase reporter expression
cassettes using non-invasive whole animal imaging has been described (Contag, C. et al.,
U.S. Patent No. 5,650,135, July 22, 1997, herein incorporated by reference; Contag, P., et al.,
Nature Medicine 4(2):245-247, 1998; Contag, C., et al., OSA TOPS on Biomedical Optical
Spectroscopy and Diagnostics 3:220-224, 1996; Contag, C.H., et al., Photochemistry and
Photobiology 66(4):523-531, 1997; Contag, C.H., et al., Molecular Microbiology 18(4):593-
603, 1995). Such imaging typically uses at least one photo detector device element, for
example, a charge-coupled device (CCD) camera.

The transgenic animals described herein can also be used to determine the effect of
an analyte (e.g., therapy), for example on tumor progression where the promoter induces
luciferase expression when a tumor develops. Methods of administration of the analyte
include, but are not limited to, injection (subcutaneously, epidermally, intradermally),
intramucosal (such as nasal, rectal and vaginal), intraperitoneal, intravenous, oral or
intramuscular. Other modes of administration include oral and pulmonary administration,
suppositories, and transdermal applications. Dosage treatment may be a single dose
schedule or a multiple dose schedule. For example, the analyte of interest can be
administered over a range of concentrations to determine a dose/response curve. The analyte
may be administered to a series of test animals or to a single test animal (given that response
to the analyte can be cleared from the transgenic animal).
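
One simple way to summarize a dose/response curve of the kind mentioned above is to estimate the dose that gives the half-maximal response. The Python sketch below interpolates that dose on a logarithmic dose axis. It is a hedged illustration only: the function name and the data values are hypothetical, the response is assumed to be a non-decreasing quantity (for example, percent change in reporter signal), and a fitted sigmoidal model would normally be preferred when enough data points are available.

```python
import math


def half_maximal_dose(doses, responses):
    """Interpolate, on a log10 dose axis, the dose giving the half-maximal response.

    Assumes doses are positive and strictly increasing and responses are
    non-decreasing. Returns None if the midpoint is never bracketed.
    """
    if len(doses) != len(responses) or len(doses) < 2:
        raise ValueError("need at least two matched dose/response points")
    target = (min(responses) + max(responses)) / 2.0
    points = list(zip(doses, responses))
    for (d0, r0), (d1, r1) in zip(points, points[1:]):
        if r0 <= target <= r1 and r1 != r0:
            frac = (target - r0) / (r1 - r0)
            log_dose = math.log10(d0) + frac * (math.log10(d1) - math.log10(d0))
            return 10 ** log_dose
    return None


if __name__ == "__main__":
    doses = [1.0, 10.0, 100.0]       # hypothetical doses, arbitrary units
    responses = [0.0, 50.0, 100.0]   # hypothetical responses, percent
    print(half_maximal_dose(doses, responses))
```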

VEGF and VEGFR Genes 

In one aspect the present invention relates to the isolation and characterization of the
mouse VEGFR-2 gene promoter. This section describes some information related to the
VEGF and VEGFR gene families. Alternative names for some of these genes are as
follows: VEGF (vascular endothelial growth factor) is also named VPF (vascular
permeability factor); VEGFR-1 is also named FLT1; VEGFR-2 is also named KDR/FLK1;
and VEGFR-3 is also named FLT4.

VEGF is a homodimeric 45 kDa (monomer 23 kDa) protein. VEGF has five
isoforms of which VEGF165 and VEGF121 are the most abundant. Both are ligands for
VEGFR-2 as well as VEGFR-1 (Soker, S., et al., JBC 271:5761-67, 1996). VEGF165 is the
only VEGF isoform that binds to Neuropilin-1 (Soker, S., et al., Cell 92:735-745, 1998).
VEGF is extremely unstable; its half life in circulation is only 3 minutes (Ferrara, N., et al.,
Nature 380:439-442, 1996; Ferrara, N., et al., Endocr Rev 18:4-25, 1997).

VEGF-B is 43% (aa) identical to VEGF and exists as homodimers. It can also form
heterodimers with VEGF (Olofsson, B., et al., Proc Natl Acad Sci USA 93:2576-81, 1996).
VEGF-B is a ligand for VEGFR-1 (Olofsson, B., et al., Proc Natl Acad Sci USA 95:11709-
14, 1998).

VEGF-C is 30% (aa) identical to VEGF. The mature VEGF-C is 23 kDa; the
precursor protein is 35.8 kDa. VEGF-C is a ligand for VEGFR-3 as well as VEGFR-2. It
induces autophosphorylation of both receptors (Joukov, V., et al., EMBO J 15:290-298,
1996).

VEGF-D is 31% (aa) identical to VEGF165 and 48% (aa) identical to VEGF-B. The
mature VEGF-D is approximately 22 kDa. VEGF-D is a ligand for VEGFR-3 as well as
VEGFR-2. It induces autophosphorylation of both receptors (Achen, M.G., et al., Proc
Natl Acad Sci USA 95:548-53, 1998).

PlGF is 46% identical (aa) to VEGF (Maglione, D., et al., Proc Natl Acad Sci
88:9267-71, 1991) and can form heterodimers with VEGF (DiSalvo, J., et al., JBC
270:7717-23, 1995).

VEGFR-1 is an approximately 180 kDa tyrosine kinase receptor for VEGF-B
(Olofsson, B., et al., Proc Natl Acad Sci USA 95:11709-14, 1998) and VEGF (de Vries, C.,
et al., Science 255:989-91, 1992) and PlGF (Park, J.E., et al., J Biol Chem 269:25646-54,
1994).

VEGFR-2 is an approximately 200 kDa tyrosine kinase receptor for VEGF (Terman,
B.I., et al., Oncogene Sept 6(9):1677-83, 1991), VEGF-C (Joukov, V., et al., EMBO J
15:290-298, 1996), and VEGF-D (Achen, M.G., et al., Proc Natl Acad Sci USA
95:548-53, 1998).

VEGFR-3 is a tyrosine kinase receptor (Pajusola, K., et al., Cancer Res 52:5738-43,
1992) on lymphatic EC for VEGF-C (Dumont, D.J., et al., Science 282:946-949, 1998) and
VEGF-D (Achen, M.G., et al., Proc Natl Acad Sci USA 95:548-53, 1998). VEGFR-3
has a processed mature form of about 125 kDa, and an unprocessed form of about 195 kDa
(Achen, M.G., et al., Proc Natl Acad Sci USA 95:548-53, 1998).

Neuropilin-1 is an approximately 130 kDa receptor. It binds
VEGF165, but not VEGF121 (Soker, S., et al., Cell 92:735-745, 1998).
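
The ligand/receptor relationships listed above can be collected into a small lookup table, which makes it easy to ask the reverse question of which ligands a given receptor binds. The Python sketch below is only a restatement of the pairings given in this section; the variable and function names are hypothetical, and placenta growth factor is written PlGF.

```python
# Ligand -> receptors, as stated in the text above.
LIGAND_RECEPTORS = {
    "VEGF": ["VEGFR-1", "VEGFR-2"],
    "VEGF-B": ["VEGFR-1"],
    "VEGF-C": ["VEGFR-2", "VEGFR-3"],
    "VEGF-D": ["VEGFR-2", "VEGFR-3"],
    "PlGF": ["VEGFR-1"],
}
# Isoform-level note from the text: VEGF165, but not VEGF121, also binds Neuropilin-1.


def ligands_for(receptor: str, table: dict = LIGAND_RECEPTORS) -> list:
    """Return the sorted list of ligands that the table pairs with a receptor."""
    return sorted(ligand for ligand, receptors in table.items() if receptor in receptors)


if __name__ == "__main__":
    for receptor in ("VEGFR-1", "VEGFR-2", "VEGFR-3"):
        print(receptor, ligands_for(receptor))
```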

Expression of many of these genes has been evaluated in adults. A summary of 
information relating to expression follows here. 

VEGF has an approximately 3.7 kb transcript. It is expressed in multiple human 
tissues, including heart, skeletal muscle and prostate. In mouse, VEGF is mainly expressed 
in heart, lung and kidney. The rest of the human or mouse tissues, including brain and testis, 
do not express detectable or significant levels of VEGF (Olofsson, B., et al., Proc Natl Acad
Sci USA 93:2576-81, 1996). In another study, it was shown that VEGF is highly expressed 
in epithelial cells of lung alveoli, renal glomeruli and adrenal cortex and in cardiac myocytes 
(Berse, B., MCB 3:211-20, 1992).

VEGF-B has an approximately 1.4 kb transcript. It is expressed in a majority of
human and mouse tissues. In human, VEGF-B is most prominently expressed in heart,
skeletal muscle, pancreas, brain and prostate. In mouse, VEGF-B is mostly expressed in heart,
skeletal muscle, brain and kidney. Liver does not appear to express a significant level of
VEGF-B in either humans or mice. VEGF-B and VEGF are co-expressed in many human
tissues, such as heart, skeletal muscle, pancreas and prostate. In general, VEGF-B is more
abundantly expressed than VEGF. VEGF-B can act as an endothelial cell growth factor
(Olofsson, B., et al., Proc Natl Acad Sci USA 93:2576-81, 1996).

VEGF-C has an approximately 2.4 kb transcript that is expressed in multiple human
tissues, most prominently in heart, skeletal muscle, placenta, ovary, small intestine, pancreas
and prostate. Several tissues, including brain and liver, do not appear to express detectable
levels of VEGF-C (Joukov, V., et al., EMBO J 15:290-298, 1996).

VEGF-D has an approximately 2.3 kb transcript that is expressed in multiple human
tissues, most prominently in heart, skeletal muscle, lung, colon and small intestine. Several
tissues, including brain, liver, and placenta, do not appear to express detectable levels of
VEGF-D (Achen, M.G., et al., Proc Natl Acad Sci USA 95:548-53, 1998).

VEGFR-1 appears to be endothelial cell specific (Peters, K.G., et al., Proc Natl Acad
Sci 90:8915-19, 1993). VEGFR-1 cDNA is approximately 7.7 kb and encodes a protein of
1338 aa. It was expressed in a variety of normal tissues of adult rat (Shibuya, M., et al.,
Oncogene 5:519-24, 1990). In a glioma model of tumor angiogenesis, both VEGFR-1 and
VEGFR-2 are specifically expressed in ECs that have penetrated into the tumor, but are
absent from ECs in the normal brain tissues. VEGF expression was detectable in glioma cells
along the necrotic edge (Plate, K.H., et al., Cancer Research 53:5822-27, 1993).

VEGFR-2 is expressed as an approximately 7 kb transcript (Terman, B.I., et al.,
Oncogene Sept 6(9):1677-83, 1991) that appears to be endothelial cell specific. VEGFR-2
is expressed ubiquitously in many tissues, including heart, placenta, lung and kidney. The
expression levels of VEGFR-2 are relatively low in these tissues compared with Neuropilin-1
expression. Brain does not appear to express detectable levels of VEGFR-2 (Soker, S., et al.,
Cell 92:735-745, 1998). In situ hybridization analysis revealed a specific association of
VEGFR-2 with endothelial cells at all stages of mouse development. It is abundant in
proliferating endothelial cells of vascular sprouts and branching vessels of embryonic and
early postnatal brain, but is drastically reduced in adult brain, where proliferation has
ceased (Millauer, B., Cell 72:835-46, 1993).

VEGFR-3 is expressed as approximately 5.8 kb and 4.5 kb mRNAs. Most fetal
tissues expressed VEGFR-3, with spleen, brain intermediate zone, and lung showing the
highest levels. It does not appear to be expressed in the endothelial cells of blood vessels
(Pajusola, K., et al., Cancer Res 52:5738-43, 1992). During embryonic development,
VEGFR-3 is expressed in blood vessels but becomes largely restricted to the lymphatic
endothelium postnatally (Kaipainen, A., et al., Proc Natl Acad Sci USA 92:3566-3570,
1995).

Neuropilin-1 is expressed in both endothelial cells and many types of tumor cells as
an approximately 7 kb transcript. Most tissues express high levels of Neuropilin-1,
especially heart and placenta. Skeletal muscle, pancreas, lung and kidney also express
high levels of Neuropilin-1. Brain does not appear to express detectable levels of
Neuropilin-1 (Soker, S., et al., Cell 92:735-745, 1998).

Some functions of these genes have been evaluated and are as follows.

VEGF is a specific mitogen for EC in vitro and a potent angiogenic factor in vivo. In
vitro, VEGF binds and induces autophosphorylation of VEGFR-2 and VEGFR-1, but the
mitogenic response is mediated only through VEGFR-2 (Waltenberger, J., JBC 269:26988-
95, 1994). VEGF functions as a survival factor for newly formed vessels during
developmental neovascularization, possibly through mediating interaction of endothelial
cells with underlying matrix, but is not required for maintenance of mature vessels
(Benjamin, L.E., et al., Proc Natl Acad Sci 94:8761-66, 1997). In embryogenesis, VEGF
and VEGFR-2 interaction induces the birth and proliferation of endothelial cells (Hanahan, D.,
Science 277:48-50, 1997). Binding of VEGF to VEGFR-1 elicits endothelial cell-cell
interactions and capillary tube formation, a process that follows closely proliferation and
migration of endothelial cells (Hanahan, D., Science 277:48-50, 1997). In a tumorigenesis
study, it was shown that VEGF is critical for the initial s.c. growth of T-47D breast
carcinoma cells transplanted into nude mice, whereas other angiogenic factors such as bFGF
can compensate for the loss of VEGF after the tumors have reached a certain size (Yoshiji,
H., et al., Cancer Research 57:3924-28, 1997). VEGF is a major mediator of aberrant
endothelial cell (EC) proliferation and vascular permeability in a variety of human
pathologic situations, such as tumor angiogenesis, diabetic retinopathy and rheumatoid
arthritis (Benjamin, L.E., et al., Proc Natl Acad Sci 94:8761-66, 1997; Soker, S., et al., Cell
92:735-745, 1998). VEGF induces expression of plasminogen activator (PA), PA inhibitor
1 (PAI-1), MMP, and interstitial collagenase in EC. These findings are consistent with the
proangiogenic activities of VEGF. VEGF promotes expression of VCAM-1 and ICAM-1 in
EC, and thus may facilitate the adhesion of activated NK cells to EC. VEGF may promote
monocyte chemotaxis (Pepper, M.S., et al., BBRC 181:902-906, 1991; Ferrara, N., et al.,
Endocr Rev 18:4-25, 1997). Tumors are believed to be the principal source of VEGF. A
correlation has been observed between VEGF expression and vessel density in human breast
tumors, renal cell carcinoma and colon cancer (Fong, T.A.T., et al., Cancer Res 59:99-106,
1999). VEGF and PGF expression were significantly upregulated in 96% and 91%, respectively, of
hypervascular renal carcinoma tissues compared with adjacent normal kidney tissues
(Takahashi, A., et al., Cancer Res 54:4233-7, 1994).

VEGF-B is a mitogen for EC and may be involved in angiogenesis in muscle and
heart (Olofsson, B., et al., Proc Natl Acad Sci USA 93:2576-81, 1996). In vitro, binding of
VEGF-B to its receptor VEGFR-1 leads to increased expression and activity of urokinase-
type plasminogen activator and plasminogen activator inhibitor, suggesting a role for
VEGF-B in the regulation of extracellular matrix degradation, cell adhesion, and migration
(Olofsson, B., et al., Proc Natl Acad Sci USA 95:11709-14, 1998).

VEGF-C may regulate angiogenesis of lymphatic vasculature, as suggested by the
pattern of VEGF-C expression in mouse embryos (Kukk, E., et al., Development 122:3829-
37, 1996). Although VEGF-C is also a ligand for VEGFR-2, the functional significance of
this potential interaction is unknown. Overexpression of VEGF-C in the skin of transgenic
mice resulted in lymphatic, but not vascular, endothelial proliferation and vessel
enlargement, suggesting the major function of VEGF-C is through VEGFR-3 rather than
VEGFR-2 (Jeltsch, M., et al., Science 276:1423-5, 1997). Using the CAM assay, VEGF and
VEGF-C were shown to be specific angiogenic and lymphangiogenic growth factors,
respectively (Oh, S.J., et al., Dev Biol 188:96-109, 1997).

VEGF-D is a mitogen for EC. VEGF-D can also activate VEGFR-3. It is possible
that VEGF-D could be involved in the regulation of growth and/or differentiation of
lymphatic endothelium (Achen, M.G., et al., Proc Natl Acad Sci USA 95:548-53,
1998).

PlGF can potentiate the action of low concentrations of VEGF in vitro and in vivo
(Park, J.E., et al., J Biol Chem 269:25646-54, 1994).

The VEGFR-1 signaling pathway may regulate normal endothelial cell-cell or cell-matrix
interactions during vascular development, as suggested by a knockout study (Fong, G.H., et
al., Nature 376:65-69, 1995). Although VEGFR-1 has a higher affinity for VEGF than
VEGFR-2, it does not transduce the mitogenic signals of VEGF in ECs (Soker, S., et al.,
Cell 92:735-745, 1998).

VEGFR-2 appears to be the major transducer of VEGF signals in EC that result in
chemotaxis, mitogenicity and gross morphological changes in target cells (Soker, S., et al.,
Cell 92:735-745, 1998).

VEGFR-3 has an essential role in the development of the embryonic cardiovascular
system before the emergence of lymphatic vessels, as shown by a knockout study (Dumont,
D.J., et al., Science 282:946-949, 1998).

Neuropilin-1 is a receptor for VEGF165. It can enhance the binding of VEGF165 to
VEGFR-2 and VEGF165-mediated chemotaxis (Soker, S., et al., Cell 92:735-745, 1998).

Gene regulation of some of these genes has been investigated and is discussed herein
below.

In situ hybridization demonstrated VEGF mRNA was present in transplanted tumor
cells but not in tumor blood vessels, indicating that immunohistochemical labeling of tumor
vessels with VEGF antibodies reflects uptake of VEGF, not endogenous synthesis. VEGF
protein staining was evident in adjacent preexisting venules and small veins as early as 5
hours after tumor transplant and plateaued at maximally intense levels in newly induced
tumor vessels by approximately 5 days. In contrast, vessels more than approximately 0.5
mm distant from tumors were not hyperpermeable and did not exhibit immunohistochemical
staining for VEGF. Vessel staining disappeared within 24-48 h of tumor rejection. These
studies indicate that VEGF is synthesized by tumor cells in vivo and accumulates in nearby
blood vessels. Because leaky tumor vessels initiate a cascade of events, which include
plasma extravasation and which lead ultimately to angiogenesis and tumor stroma
formation, VEGF plays a pivotal role in promoting tumor growth (Dvorak, H.F., et al., J Exp
Med 174:1275-8, 1991). In addition, it was shown that stromal cells can be stimulated by
transplanted tumor cells for VEGF production (Fukumura, D., et al., Cell, 94:715-25, 1998).
Fibroblasts cultured in vitro are highly activated for VEGF promoter function compared
with fibroblasts in freshly isolated tumors, indicating the culture condition did not mimic the
status of normal (unactivated) tissue in vivo (Fukumura, D., et al., Cell, 94:715-25, 1998).
For example, C6 tumor spheroids (C6 is a cell line derived from a rat glial tumor; C6 cells
aggregate and form small spheroids in culture) implanted into nude mice became
neovascularized accompanied by a gradual reduction of VEGF expression (Shweiki, D., et
al., Proc Natl Acad Sci 92:768-772, 1995). The VEGF promoter region bears many of the
characteristics of a house-keeping gene (Tischer, E., JBC 266:11947-11954, 1991), hence it
is likely that almost any cell type could serve as a source for VEGF upon hypoxic or
ischemic demand (Fukumura, D., et al., Cell, 94:715-25, 1998).

VEGF expression was upregulated by hypoxia (Shweiki, D., et al., Nature 359:843-
5, 1992), due to both increased transcriptional activation and stability of its mRNA (Ikeda,
E., et al., JBC 270:19761-5, 1995). In a number of in vitro studies, it was shown that
hypoxia upregulates VEGF expression through the activation of the PI3K/Akt pathway
(Mazure, N.M., et al., Blood 90:3322-31, 1997) and HIF-1 (a transcription factor that is induced
by hypoxia and binds to the VEGF promoter region) (Forsythe, J.A., MCB 16:4604-13, 1996; Mazure, N.M.,
et al., Blood 90:3322-31, 1997). VEGF is also upregulated by overexpression of v-Src
oncogene (Mukhopadhyay, D., Cancer Res. 15:6161-5, 1995), c-SRC (Mukhopadhyay, D.,
et al., Nature 375:577-81, 1995), and mutant ras oncogene (Plate, K.H., Nature 359:845-8,
1992). The tumor suppressor p53 downregulates VEGF expression (Mukhopadhyay, D.,
Cancer Res. 15:6161-5, 1995). A number of cytokines and growth factors, including PGF,
TPA (Grugel, S., et al., JBC 270:25915-9, 1995), EGF, TGF-b, IL-1, and IL-6 induce VEGF
mRNA expression in certain types of cells (Ferrara, N., et al., Endocr Rev 18:4-25, 1997).
Kaposi's sarcoma-associated herpesvirus (KSHV), which encodes a G-protein-coupled
receptor that is a homolog of the IL-8 receptor, can activate JNK/SAPK and p38MAPK and increase
VEGF production, thus causing cell transformation and tumorigenicity (Bais, C., Nature
391:86-9, 1998).

Androgen-dependent Shionogi carcinoma grown in immunodeficient mice
regressed after the mice were castrated, accompanied by a decrease in VEGF expression.
Two weeks after castration, a second wave of angiogenesis and tumor growth began with a
concomitant increase in VEGF expression (Jain, R.K., Proc Natl Acad Sci USA 95:10820-
5, 1998).

VEGF-D is induced by transcription factor c-fos in mouse (Orlandini, M. Proc Natl 
Acad Sci 93:11675-80, 1996). 

Overexpression of some of these genes has been evaluated using different systems.

VEGF overexpression in skin of transgenic mice induces angiogenesis,
vascular hyperpermeability and accelerated tumor development (Larcher, F., et al., Oncogene
17:303-11, 1998). Retina tissue-specific VEGF overexpression in transgenic mice causes
intraretinal and subretinal neovascularization (Okamoto, N., et al., Am J Pathol 151:281-91,
1997). VEGF overexpression mediated by the Tet system promotes tumorigenesis of C6
glioma cells when transplanted into nude mice. The tumors become hypervascularized with
abnormally large vessels, arising from excessive fusions. The tumors were less necrotic.
After VEGF expression was shut off, regression of the tumors occurred due to detachment
of endothelial cells from the walls of preformed vessels and their subsequent apoptosis.
Vascular collapse further led to hemorrhages and extensive tumor necrosis (Benjamin,
L.E., et al., Proc Natl Acad Sci 94:8761-66, 1997). In human-VEGF-promoter-GFP
transgenic mice, implantation of solid tumor induces specific GFP expression in stromal
cells. Transgenic mice were mated with T-antigen mice (able to form spontaneous mammary
tumors) to generate double transgenic mice, in which spontaneous mammary tumors were
formed. Strong stromal, but not tumor, expression of GFP was observed (Fukumura, D., et
al., Cell, 94:715-25, 1998). A CCD camera was used to monitor GFP expression. GFP half
life was shown to be between about 1.2-1.5 days (Fukumura, D., et al., Cell, 94:715-25,
1998). The transgene was integrated into the IgG locus of the chromosome through DNA
recombination (Fukumura, D., et al., Cell, 94:715-25, 1998). FVB derived VEGF-GFP
transgenic mice were mated with wild-type C3H mice to create hybrid mice that can
serve as hosts for C3H derived tumor lines (Fukumura, D., et al., Cell, 94:715-25, 1998).

VEGF-C overexpression in the skin of transgenic mice resulted in lymphatic, but not
vascular, endothelial proliferation and vessel enlargement (Jeltsch, M., et al., Science
276:1423-5, 1997).

Neuropilin-1 overexpression in transgenic mice resulted in embryonic lethality. The
embryos possessed excess capillaries and blood vessels. Dilated vessels and hemorrhage
were also observed (Kitsukawa, T., et al., Development 121:4309-18, 1995).

The functions of some of these genes have been evaluated in knock-out mice
constructs, animal studies, and in vitro studies.

A VEGF knockout was embryonic lethal. F1 is also embryonic lethal and
angiogenesis was impaired. VEGF secretion from +/- ES cells was reduced to 50%
(Carmeliet, P., et al., Nature 380:435-439, 1996; Ferrara, N., et al., Nature 380:439-442,
1996).

VEGFR-1 was evaluated in a lacZ knock-in wherein a fragment of the exon that
contains the ATG start codon was replaced by LacZ. Knockout mice were embryonic lethal.
Blood vessels were formed, but the organization of the blood vessels was perturbed (Fong,
G.H., et al., Nature 376:65-69, 1995).

A VEGFR-2 knockout was embryonic lethal, caused by defective endothelial cell development
(Shalaby, F., et al., Nature 376:62-65, 1995).

A VEGFR-3 knockout (LacZ knock-in) was embryonic lethal, caused by defective blood
vessel development (Dumont, D.J., et al., Science 282:946-949, 1998).

A Neuropilin-1 knockout was embryonic lethal (Dumont, D.J., et al., Science 282:946-949,
1998).

In vitro studies showed that a mutant VEGF (a heterodimer of two mutant VEGF)
(Siemeister, G., et al., Proc Natl Acad Sci 95:4625-9, 1998), as well as a GST-Exon7
(VEGF) fusion protein (Soker, S., et al., JBC 272:31582-88, 1997), was able to inhibit
endothelial cell proliferation by acting as a VEGF antagonist and interfering with VEGF binding
to VEGFR-2 and VEGFR-1 (Siemeister, G., et al., Proc Natl Acad Sci 95:4625-9, 1998).
More importantly, a VEGF neutralizing chimeric protein, containing the extracellular
domain of a VEGF receptor (either VEGFR-1 or VEGFR-2) fused with IgG, substantially
reduced the development of retinal neovascularization when injected into mice with
ischemic retinal disease (Aiello, L.P., et al., Proc Natl Acad Sci 92:10457-61, 1995).

Treatment of tumors with monoclonal antibodies directed against VEGF resulted in
a dramatic reduction in tumor mass due to the suppression of tumor angiogenesis (Kim, K.J.,
et al., Nature 362:841-44, 1993). Injection of antibodies against VEGF reduced tumor
vascular permeability and vessel diameter in immunodeficient mice transplanted with
human glioblastoma, colon adenocarcinoma, and melanoma (Yuan, F., et al., Proc Natl Acad
Sci 93:14765-70, 1996). Retrovirus mediated overexpression of a dominant negative form
of VEGFR-2 in nude mice suppresses the growth of transplanted rat C6 glioma tumor cells
(Millauer, B., et al., Nature 367:576-9, 1994), as well as mammary and ovarian tumors and
lung carcinoma (Millauer, B., et al., Cancer Res 56:1615-20, 1996).

Mouse VEGFR-2 Promoter

The subject nucleic acids of the present invention (e.g., as described in Example 3)
find a wide variety of applications including use as hybridization probes, PCR primers,
expression constructs useful for compound screening, detecting the presence of VEGFR-2
genes or variants thereof, detecting the presence of gene transcripts, detecting or amplifying
nucleic acids encoding additional VEGFR-2 promoter sequences or homologs thereof (as
well as structural analogs), and in a variety of screening assays.

The present invention provides efficient methods of identifying pharmacological
agents or lead compounds for agents active at the level of VEGFR-2 gene transcription. A
wide variety of assays for transcriptional regulators can be used based on the teaching of the
present specification, including, but not limited to, cell-based transcription
assays, screening in vivo in transgenic animals, and promoter-protein binding assays. For
example, the disclosed luciferase reporter constructs are used to transfect cells for cell-based
transcription assays. In one such assay, primary endothelial cells are plated onto
microtiter plates and used to screen libraries of candidate agents for lead
compounds which modulate the transcriptional regulation of the VEGFR-2 gene
promoter, as monitored by luciferase expression (Example 5).

As noted above, the present invention relates to a recombinant nucleic acid
molecule comprising the promoter region of a mouse VEGFR-2 gene. This invention
provides a nucleic acid molecule having a sequence selected, for example, from the
following groups: (a) a nucleic acid sequence of greater than 80% identity to that of SEQ ID
NO:32, or a fragment thereof, exhibiting promoter activity, in particular VEGFR-2 promoter
activity; (b) a nucleic acid sequence substantially complementary to said nucleic acid
sequence of (a), or a fragment thereof; and (c) a nucleic acid sequence that specifically
hybridizes to said nucleic acid sequences of (a) or (b) or fragments thereof.
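
As an illustration of how an identity threshold such as that in group (a) might be checked, the following Python sketch computes percent identity between two sequences that have already been aligned to equal length. The function name `percent_identity`, the gap convention, and the toy sequences are assumptions made for illustration only; they are not taken from the specification, and an actual comparison would use an alignment program and SEQ ID NO:32 itself.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be pre-aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # gapped columns are skipped (one common convention; an assumption here)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy example with hypothetical sequences (not derived from SEQ ID NO:32)
reference = "ACGTACGTAC"
candidate = "ACGTTCGTAC"
identity = percent_identity(reference, candidate)
meets_threshold = identity > 80.0  # threshold from group (a)
```

Other conventions (for example, counting gapped columns as mismatches) give different values, so the convention used should be stated alongside any reported identity.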

The invention includes further VEGFR-2 promoter sequences identified based on the 
teachings of the present specification (including, but not limited to, sequence information 
and isolation methods, e.g., Example 3). 

This invention also provides novel deletion constructs of the VEGFR-2
promoter which either increase or decrease promoter activity beyond that of the
naturally occurring promoter. Such constructs may provide greater sensitivity than the native
promoter when used to screen for compounds which affect VEGFR-2 promoter activity.
The nucleic acid molecules of this invention are useful in effecting tissue
specific expression in endothelial cells, as well as for screening for compounds that
selectively modulate transcription in endothelial cells and compounds that modulate
angiogenic processes.

Those skilled in the art can practice the invention by following the guidance of the
specification supplemented with standard procedures of molecular biology for the isolation
and characterization of the VEGFR-2 promoters, their transfection into host cells, and
vascular endothelial cell-specific expression of heterologous DNA operably linked to said
VEGFR-2 promoters. For example, DNA is commonly transferred or introduced into
recipient mammalian cells by calcium phosphate-mediated gene transfer, electroporation,
lipofection, viral infection, and the like. General methods and vectors for gene transfer and
expression may be found, for example, in M. Kriegler, Gene Transfer and Expression: A
Laboratory Manual, Stockton Press (1990). Direct gene transfer to
cells in vivo can be achieved, for example, by the use of modified viral vectors (including,
but not limited to, retroviruses, adenoviruses, adeno-associated viruses and herpes viruses),
liposomes, and direct injection of DNA into certain cell types. In this manner,
recombinant expression vectors and recombinant cells containing the novel VEGFR-2
promoters of the present invention operably linked to a desired heterologous gene
can be delivered to specific target cells in vivo. See, e.g., Wilson, Nature,
365: 691-692 (1993); Plautz et al., Annals NY Acad. Sci., 716: 144-153 (1994);
Farhood et al., Annals NY Acad. Sci., 716: 23-34 (1994); and Hyde et al., Nature,
362: 250-255 (1993). Furthermore, cells may be transformed ex vivo and introduced directly
at localized sites by injection, e.g., intra-articular, intracutaneous, intramuscular and the like.

Cloning and characterization of the VEGFR-2 promoter are described in Examples 3 
and 5, below. 

Compound/Drug Screening 

Another aspect of this invention is its use in screening for pharmacologically
active agents (or compounds) that modulate VEGFR-2 promoter activity either by
affecting signal transduction pathways that necessarily precede transcription or
by directly affecting transcription of the VEGFR-2 gene.

For screening purposes an appropriate host cell, preferably an endothelial cell,
more preferably a vascular endothelial cell, is transformed with an expression
vector comprising a reporter gene (e.g., luciferase) operably linked to the VEGFR-2 gene
promoter of this invention. The transformed host cell is exposed to various test substances
and then analyzed for expression of the reporter gene. This expression can be compared to
expression from cells that were not exposed to the test substance. A compound which
increases the promoter activity of the VEGFR-2 promoter will result in increased reporter
gene expression relative to the control. Similarly, compounds which act as antagonists for
the VEGFR-2 promoter signalling pathway will result in decreased reporter gene expression
relative to the control.

Thus, one aspect of the invention is to screen for test compounds that regulate the
activity of the VEGFR-2 promoter by, for example, (i) contacting a host cell in which the
VEGFR-2 promoter disclosed herein is operably linked to a reporter gene with a test
medium containing the test compound under conditions which allow for expression of the
reporter gene; (ii) measuring the expression of the reporter gene in the presence of the test
medium; (iii) contacting the host cell with a control medium which does not contain the test
compound but is otherwise essentially identical to the test medium in (i), under conditions
essentially identical to those used in (i); (iv) measuring the expression of the reporter gene in
the presence of the control medium; and (v) relating the difference in expression between (ii)
and (iv) to the ability of the test compound to affect the activity of the VEGFR-2 promoter.
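
Step (v) of the screen described above can be sketched, under stated assumptions, as a simple ratio of reporter signal in test medium to reporter signal in control medium. In the Python sketch below, the function names, the replicate readings, and the two-fold cutoff are hypothetical choices made for illustration; the specification does not prescribe a particular statistic or threshold.

```python
from statistics import mean


def fold_change(test_readings, control_readings):
    """Mean reporter signal in test medium divided by mean signal in control medium."""
    control_mean = mean(control_readings)
    if control_mean <= 0:
        raise ValueError("control signal must be positive")
    return mean(test_readings) / control_mean


def classify(fold, threshold=2.0):
    """Label a compound using a symmetric fold-change cutoff (hypothetical value)."""
    if fold >= threshold:
        return "increase"
    if fold <= 1.0 / threshold:
        return "decrease"
    return "no clear effect"


# Hypothetical luciferase readings (arbitrary light units) from replicate wells
test_wells = [2000, 2200, 1800]      # step (ii): test medium
control_wells = [1000, 900, 1100]    # step (iv): control medium
fold = fold_change(test_wells, control_wells)
label = classify(fold)
```

In practice, replicate variability and a statistical test would normally be considered before a compound is called a modulator.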

Alternatively, the transformed cells may be induced with a transcriptional inducer,
such as IL-1, TNF-alpha, forskolin, dibutyryl-cAMP, or a phorbol-type tumor promoter,
e.g., PMA. Transcriptional activity is measured in the presence or absence of a
pharmacologic agent of known activity (e.g., a standard compound) or putative activity (e.g.,
a test compound). A change in the level of expression of the reporter gene in the presence of
the test compound is compared to that effected by the standard compound. In this way, the
ability of a test compound to affect VEGFR-2 transcription and the relative
potencies of the test and standard compounds can be determined.

Thus in a further aspect, the present invention provides methods of measuring the
ability of a test compound to modulate VEGFR-2 transcription by: (i) contacting a host cell
in which the VEGFR-2 promoter, disclosed herein, is operably linked to a reporter gene with
an inducer of VEGFR-2 promoter activity under conditions which allow for expression of
the reporter gene; (ii) measuring the expression of the reporter gene in the absence of the test
compound; (iii) exposing the host cells to the test compound either prior to, simultaneously
with, or after contacting the host cells with the inducer; (iv) measuring the expression of the
reporter gene in the presence of the test compound; and (v) relating the difference in
expression between (ii) and (iv) to the ability of the test compound to modulate
VEGFR-2-mediated transcription.
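
The comparison of induced reporter expression with and without a compound, and the comparison of a test compound against a standard compound, can be sketched as follows. The function name `percent_change`, the readings, and the use of a simple ratio as a relative measure are assumptions for illustration only; the specification does not define how relative potency is to be calculated.

```python
def percent_change(induced_without, induced_with):
    """Percent change in induced reporter signal caused by adding a compound.

    Negative values indicate inhibition of the induced signal.
    """
    if induced_without <= 0:
        raise ValueError("induced signal without compound must be positive")
    return 100.0 * (induced_with - induced_without) / induced_without


# Hypothetical readings (arbitrary light units); none are from the specification
induced_only = 5000.0           # inducer alone, step (ii)
with_test_compound = 2500.0     # inducer plus test compound, step (iv)
with_standard = 1250.0          # inducer plus standard compound of known activity

test_effect = percent_change(induced_only, with_test_compound)
standard_effect = percent_change(induced_only, with_standard)

# One simple (assumed) way to express the test compound relative to the standard
relative_effect = test_effect / standard_effect
```

A dose-response series for each compound would usually give a more meaningful potency comparison than a single concentration.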

Because different inducers are known to affect different modes of signal
transduction, it is possible to identify, with greater specificity, compounds that affect a
particular signal transduction pathway. Further, because the VEGF receptors have been
shown to be upregulated in tumor cells and this upregulation appears to be necessary for
tumor angiogenesis, such assays provide a means of identifying compounds that will inhibit
and/or reverse tumor growth by downregulating VEGFR-2 expression and thus preventing
or reducing tumor angiogenesis.

A variety of reporter genes may be used in the practice of the present invention. Preferred
are those that produce a protein product which is easily measured in a routine assay. Suitable
reporter genes include, but are not limited to, chloramphenicol acetyl transferase (CAT),
light generating proteins (e.g., luciferase), and beta-galactosidase. Convenient assays
include, but are not limited to, colorimetric, fluorimetric and enzymatic assays. In one aspect,
reporter genes may be employed that are expressed within the cell and whose
products are directly measured in the intracellular medium, or in an extract of the
intracellular medium of a cultured cell line. This provides advantages over using a reporter
gene whose product is secreted, since the rate and efficiency of the secretion introduce
additional variables which may complicate interpretation of the assay. In a preferred
embodiment, the reporter gene encodes a light generating protein. When using the light
generating reporter proteins described herein, expression can be evaluated accurately and
non-invasively as described above (see, for example, Contag, P. R., et al., (1998) Nature Med.
4:245-7; Contag, C. H., et al., (1997) Photochem Photobiol. 66:523-31; Contag, C. H., et al.,
(1995) Mol Microbiol. 18:593-603).

In another aspect of this invention, transgenic animals expressing a
heterologous gene encoding a detectable product under the regulatory control of
the VEGFR-2 promoter, as disclosed herein, may be used to determine the effect
of a test compound on the stimulation or inhibition of the VEGFR-2 promoter in vivo. The
test compound is, for example, administered to the animal and the degree of
expression of the heterologous gene observed is compared to the degree of
expression in the absence of administration of the test compound using, for example, whole
animal luciferase-based assays as disclosed herein. Methods of generating transgenic
animals were described above.

This invention also provides transgenic animals useful as disease models for 
studying VEGFR-2 function and endothelial cell-specific gene expression. 

Various forms of the different embodiments of the invention, described herein, may 
be combined. 

The following examples are intended only to illustrate the present invention and 
should in no way be construed as limiting the subject invention. 

EXAMPLES 

Example 1 

Generating the Targeting Cassette and Vector 
A. Creation of the Backbone Vector 

pTK53: The 0.5 kb mouse phosphoglycerate kinase 1 promoter was amplified with
PGK primers (PGKF, SEQ ID NO:1: ATCGAATTCTACCGGGTAGGGGAGGCGCTTT;
PGKR, SEQ ID NO:2: GGCTGCAGGTCGAAAGGCCCGGAGATGAGG) using mouse
genomic DNA (Genome Systems, Inc., St. Louis, MO) as template. This fragment was then
double digested with EcoRI and PstI and cloned into the pKS vector (Stratagene, La Jolla,
California) which was linearized with the same enzymes. The neomycin gene was amplified
with NeoF (SEQ ID NO:3: ACCTGCAGCCAATATGGGATCGGCCATTGAAC) and
NeoR (SEQ ID NO:4: GGATCCGCGGCCGCCCCCAGCTGGTTCTTTCCGCCTC)
primers using pNTKV1907 (Stratagene) as a template. The 1.1 kb PCR fragment was double
digested with PstI and BamHI and cloned into the pKS-PGK vector which was linearized
with the same enzymes. This pKS-PGK-Neo vector was used to clone the thymidine kinase
gene as follows. Primers TKF (SEQ ID NO:5:
GGATCCTCTAGAGTCGAGCAGTGTGGTTTT) and TKR (SEQ ID NO:6:
GAGCTCCCGTAGTCAGGTTTAGTTCGTCCG) were used to amplify the TK gene from
pNTKV1907 (Stratagene). The amplified 2 kb fragment was then digested with BamHI and
SacI and cloned into the pKS-PGK-Neo vector that was linearized with the same enzymes.
This constructed vector was designated as pTK. A synthetic linker F5R5 was made by
annealing two primers (forward primer, SEQ ID NO:7:
GTACATTTAAATCCTGCAGG; reverse primer, SEQ ID NO:8:
AGCTCCTGCAGGATTTAAAT). This linker was inserted between the Asp718I and HindIII
sites of pTK and the new construct was designated pTK5. A second synthetic linker F3R3
was made by annealing two primers (F3R31 forward primer, SEQ ID NO:9:
GGCCCGGGCTTAATTAATGCATCATATGGTACCGTTTAAACGCGGCCGCAAGCTTGTCGACGGCGCGCCGGCCGGCC;
F3R32 reverse primer, SEQ ID NO:10:
GATCGGCCGGCCGGCGCGCCGTCGACAAGCTTGCGGCCGCGTTTAAACGGTACCATATGATGCATTAATTAAGCCCG).
This linker was inserted between the NotI and
BamHI sites of pTK and the new construct was designated pTK53. Schematics of the
vectors are shown in Figure 1.
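
The annealing of the F5R5 linker oligonucleotides can be checked with a short script. The Python sketch below uses the two sequences given above (SEQ ID NO:7 and SEQ ID NO:8); the helper names and the assumption that each oligonucleotide carries a four-nucleotide 5' single-stranded overhang are introduced here for illustration and are not stated in the specification.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def annealed_linker(forward: str, reverse: str, overhang: int = 4):
    """Return (forward 5' overhang, double-stranded core, reverse 5' overhang).

    Assumes each oligo has a 5' single-stranded overhang of `overhang` nt
    and that the remaining bases pair with the other oligo.
    """
    core = forward[overhang:]
    if reverse_complement(reverse)[:len(core)] != core:
        raise ValueError("oligos are not complementary over the expected core")
    return forward[:overhang], core, reverse[:overhang]


F5 = "GTACATTTAAATCCTGCAGG"  # forward primer, SEQ ID NO:7
R5 = "AGCTCCTGCAGGATTTAAAT"  # reverse primer, SEQ ID NO:8

fwd_overhang, core, rev_overhang = annealed_linker(F5, R5)
has_swai_site = "ATTTAAAT" in core   # SwaI recognition sequence
has_sbfi_site = "CCTGCAGG" in core   # SbfI recognition sequence
```

Under these assumptions the annealed duplex has GTAC and AGCT 5' overhangs, which match the ends left by Asp718I and HindIII digestion, and its core carries SwaI and SbfI recognition sequences, consistent with the later use of those two enzymes for cloning the targeting fragments.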

B. Introduction of Luciferase

pTK-LucYG and pTK-LucR: The yellow green luciferase gene was isolated from
pGL3 vector (Promega) as a HindIII-SalI fragment and was cloned into pTK53 that was
linearized with the same enzymes. The new construct was designated pTK-LucYG (8931
bp), shown in Figure 2.

The red luciferase gene was isolated from pGL3-red vector (Dr. Christopher Contag,
Stanford University, Stanford, Calif.) as a HindIII-SalI fragment and was cloned into pTK53
that was linearized with the same enzymes. The new construct was designated pTK-LucR
(8931 bp), shown in Figure 2.

Example 2
Insertion of Targeting Sequences

A. Generation of vitronectin targeting vector: The targeting construct pTKLR-Vn
was generated by inserting vitronectin (VN) DNA sequences into the pTK-LucR vector.

Vitronectin (VN) is an abundant glycoprotein present in plasma and the extracellular
matrix of most tissues. In a previous study, it was shown that heterozygous mice carrying
one normal and one null VN allele and homozygous null mice completely deficient in
vitronectin demonstrate normal development, fertility, and survival. This suggests that VN is
not essential for cell adhesion and migration during normal mouse development (Zheng, X.,
et al., Proc Natl Acad Sci U S A 1995 92:12426-30). Mouse vitronectin genomic DNA
sequence of 5004 bp was obtained from the GenBank database (Accession number X72091).
Based on this sequence, a 1.63 kb 3' end vitronectin fragment was amplified (reverse primer
VN1R, SEQ ID NO:11: CTGTATTTAAATCTGCCCACCCTATTCAGGACAGTAGTC;
forward primer VN1F, SEQ ID NO:12:
CCAATGCATCAACCCAGCCAGGAGGAGTGCG) using mouse C57BL/6 genomic DNA
as template (Genome Systems, Inc., St. Louis, MO). This fragment was digested with SwaI
and NsiI and cloned into pTK-LucR (linearized with SwaI and SbfI). This construct was
designated as pTK-LucR3. Subsequently, a 2.35 kb 5' end vitronectin fragment was
amplified (reverse primer VN2R, SEQ ID NO:13:
AACGCGTCGACTTCGGAGATGTTTCGGGGATAACCAGG; forward primer VN2F,
SEQ ID NO:14: TTGGCGCGCCCCATAGAGAAGAGACACCAAAGGCACGCTC) using
mouse C57BL/6 genomic DNA as template. This fragment was digested with SalI and AscI
and cloned into the pTK-LucR vector that was linearized with SalI and AscI. This construct
was designated as pTKLR-Vn. Figure 3A shows the restriction map of the pTKLR-Vn vector.
The polylinker between the neomycin gene and the red luciferase gene is used to insert the
VEGF promoter or other promoters of interest. The predicted homologous recombination
between pTKLR-Vn and the vitronectin gene is illustrated in Figure 3B. Upon insertion of the
VEGF-LucR transgene cassette, the endogenous vitronectin gene is destroyed. Figure 3C
shows the genomic DNA sequence of VN.
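
The vitronectin primers above carry restriction sites that match the enzymes used to digest each PCR fragment. As a hedged illustration, the following Python sketch scans the four primer sequences for the recognition sequences of SwaI, NsiI, SalI and AscI. The dictionary and function names are hypothetical, and only those four enzymes are searched for.

```python
# Recognition sequences of the enzymes named in the cloning steps (all palindromic)
SITES = {
    "SwaI": "ATTTAAAT",
    "NsiI": "ATGCAT",
    "SalI": "GTCGAC",
    "AscI": "GGCGCGCC",
}


def sites_in(primer: str):
    """Names of enzymes in SITES whose recognition sequence occurs in the primer."""
    seq = primer.replace(" ", "").upper()
    return [name for name, site in SITES.items() if site in seq]


# Primer sequences as given in the text (SEQ ID NOs:11-14)
PRIMERS = {
    "VN1R": "CTGTATTTAAATCTGCCCACCCTATTCAGGACAGTAGTC",
    "VN1F": "CCAATGCATCAACCCAGCCAGGAGGAGTGCG",
    "VN2R": "AACGCGTCGACTTCGGAGATGTTTCGGGGATAACCAGG",
    "VN2F": "TTGGCGCGCCCCATAGAGAAGAGACACCAAAGGCACGCTC",
}

found = {name: sites_in(seq) for name, seq in PRIMERS.items()}
```

The same check could be applied to the FosB primers of part B, which are described as being digested with SwaI/SbfI and SalI/AscI.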

B. Generation of Fos targeting vector: The targeting construct pTKLG-Fos was
generated by inserting FosB DNA sequences into the pTK-LucYG vector.

FosB is one of the members of the Fos family. It plays a functional role in
transcriptional regulation. It has been shown that FosB-deficient mice are born at a normal
frequency, are fertile and present no obvious phenotypic or histologic abnormalities (Gruda
et al (1996) Oncogene 12:2177-2185). A 28.8 kb genomic region that contains the mouse FosB
DNA sequence was obtained from the GenBank database (Accession number AF093624).

Using this sequence, a 1.71 kb 5' end FosB fragment was amplified (forward primer
FosB1F, SEQ ID NO:15: CTGTATTTAAATCCCGTTTCTCACTGTGCCTGTGTC;
reverse primer FosB1R, SEQ ID NO:16:
GTCTCCTGCAGGCTTCCTCCTCCTTGTTCCTTGCG) using mouse C57BL/6 genomic
DNA as template. This fragment was digested with SwaI and SbfI and cloned into the
pTK-LucYG vector that was linearized with SwaI and SbfI. This construct was designated as
pTK-LucYG3. Subsequently, a 1.58 kb 3' end FosB fragment was amplified (forward
primer FosB2F, SEQ ID NO:17:
AACGCGTCGACGGATGGGATTGACCCCCAGCCCTC; reverse primer FosB2R, SEQ
ID NO:18: TTGGCGCGCCCCTTGCCTCCACCTCTCAAATGC) using mouse C57BL/6
genomic DNA as template. This fragment was digested with SalI and AscI and cloned into
the pTK-LucYG vector that was linearized with SalI and AscI. This construct was designated
as pTKLG-Fos (Figure 4A). The polylinker between the neomycin gene and the yellow-green
luciferase gene is used to insert the VEGFR2 promoter (Example 3, Figures 5A-C, enhancer
Figure 11), the Tie2 promoter (Example 3, Figure 15, enhancer Figure 16), as well as other
promoters of interest. The predicted homologous recombination between the targeting vector
bearing the VEGFR2 promoter (Figure 14) or the Tie2 promoter (Figure 19) and the FosB gene
is also illustrated. As shown in the Figures, the VEGFR2-LucYG transgene cassette and the
Tie2-LucYG transgene cassette are inserted downstream of the FosB gene translational stop
signal. Therefore, the targeted transgenic mice should still have a functional FosB gene while
expressing the transgenes. Figure 4B shows the DNA sequence of FosB.

Example 3 

Insertion of Promoter Sequences of Interest 
A. pTKLR-Vn/VEGF: Mouse VEGF genomic DNA sequence of 2240 bp that 
contains a partial VEGF promoter region was obtained from GenBank (accession number: 
U41383). Accordingly, primers were designed to amplify a 0.69 kb (VF1-VR1A; Table 1) 
and a 0.98 kb fragment (VF2-VR2; Table 1). It was confirmed that each pair of primers can 
amplify the predicted product using mouse 129SvJ genomic DNA as template. 



Table 1

Name    SEQ ID NO:    Sequence
VF1     19            ACCTC ACTCT CCTGT CTCCC CTGAT TCCCA A
VR1A    20            GCTCT GGCGG TCACC CCCAA AAGCA
VF2     21            CCCTT TCCAA GACCC GTGCC ATTTG AGC
VR2     22            ACTTT GCCCC TGTCC CTCTC TCTGT TCGC
KF1     23            GCTGC GTCCA GATTT GCTCT CAGAT GCG
KR1     24            TTCTC AGGCA CAGAC TCCTT CTCCG TCCCT
KF2     25            CAGAT GGACG AGAAA ACAGT AGAGG CGTTG GC
KR2     26            GAGGA CTCAG GGCAG AAAGA GAGCG
TF3     27            AGCTT AGCCT GCAAG GGTGG TCCTC ATCG
TF2     28            CAAAT GCACC CCAGA GAACA GCTTA GCCTG C
TR1     29            GCTTT CAACA ACTCA CAACT TTGCG ACTTC CCG
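
Basic properties of the Table 1 primers, such as length and GC content, can be tabulated with a few lines of code. The Python sketch below is offered only as an illustration; the function name `primer_stats` is hypothetical, and the three primers shown are copied from Table 1 with their five-base grouping retained.

```python
def primer_stats(grouped_sequence: str):
    """Length and GC fraction of a primer written in space-separated groups."""
    seq = "".join(grouped_sequence.split()).upper()
    gc_count = seq.count("G") + seq.count("C")
    return {"length": len(seq), "gc_fraction": gc_count / len(seq)}


# Selected primers from Table 1
TABLE_1 = {
    "VF1": "ACCTC ACTCT CCTGT CTCCC CTGAT TCCCA A",
    "VR1A": "GCTCT GGCGG TCACC CCCAA AAGCA",
    "KR2": "GAGGA CTCAG GGCAG AAAGA GAGCG",
}

stats = {name: primer_stats(seq) for name, seq in TABLE_1.items()}
```

Such a summary does not replace the empirically determined PCR conditions referred to in Figure 6.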



Conditions for PCR amplification are shown in Figure 6. These primers were used
for PCR screening of a mouse 129/SvJ genomic DNA BAC (bacterial artificial chromosome)
library (Genome Systems, Inc., St. Louis, MO). The library, on average, contained inserts
of 120 kb with sizes ranging from 50 kb to 240 kb. A large genomic DNA fragment that
contained the VEGF promoter region was obtained. Southern blot analysis was performed to
map the VEGF promoter region. A unique HindIII restriction site was mapped
approximately 7.8 kb upstream of the ATG translational start codon of the VEGF gene. The
sequences between the HindIII site and the ATG translational start codon are inserted into the
polylinker of the pTKLR-Vn vector to finish the construction of the targeting vector that
contains the VEGF-LucR transgene (Figure 3A).



B. VEGFR2 targeting vector pTKLG-Fos-KPN 

1. Cloning of VEGFR2 promoter

Mouse VEGFR2 genomic DNA sequence of 1079 bp that contains a partial VEGFR2
promoter region was published previously (Ronicke et al (1996) Circ. Res. 79:277-285).
Accordingly, primers that were able to amplify a 0.45 kb (KF1-KR1; Table 1) and a 0.58 kb
fragment (KF2-KR2; Table 1) were designed. It was confirmed that each pair of primers
can amplify the predicted product using mouse 129SvJ genomic DNA as template. DNA
sequences for these primers are shown in Table 1 above. PCR amplification conditions are
shown in Figure 6. These primers were used for PCR screening of a mouse 129/SvJ genomic
DNA BAC library. A large genomic DNA fragment of the VEGFR2 promoter region was
obtained. Based on the VEGFR2 restriction map that was published (Ronicke et al, supra),
a 4.6 kb HindIII-XbaI fragment that covers the VEGFR2 promoter region was subcloned
from the VEGFR2 BAC clone into the pSK vector (Stratagene, La Jolla, CA) that was
linearized with HindIII and XbaI. This construct was designated pSK-K6.

2. Engineering of the VEGFR2 promoter 

pSK-K6 was engineered to delete a 159 bp sequence of the 3' end promoter region
spanning from the ATG translational start codon to an XbaI site. A 0.3 kb 3' end fragment was
amplified by PCR (forward primer (VR2F): CGCTAGTGTGTAGCCGGCGCTCTC (SEQ
ID NO:30); reverse primer (VR2R):
ATAAGAATGCGGCCGCCTGCACCTCGCGCTGGGCACAG (SEQ ID NO:31)) and
digested with Bsu36I and NotI. This fragment was used to replace the 0.45 kb Bsu36I-NotI
fragment of pSK-K6 and the resulting construct was designated pSK-KP, which contains
VEGFR2 promoter sequences of 4.5 kb, spanning from a HindIII site to the ATG
translational start codon. The 4.5 kb VEGFR2 promoter was fully sequenced and the
sequence is shown in Figures 5A-C (SEQ ID NO:32). The present invention includes, but is
not limited to, an isolated polynucleotide having at least 90%, preferably 92%, more
preferably 95%, and even more preferably 98% sequence identity to the sequence presented
as SEQ ID NO:32.

3. Cloning of VEGFR2 enhancer 

In a recent report, it was described that a 511 bp sequence within the first intron of the
VEGFR2 gene functions as an endothelial cell specific enhancer (Kappel et al (1999) Blood
12: 4284-4292). Accordingly, this VEGFR2 enhancer sequence was amplified by PCR
using VEGFR2 BAC clone DNA (forward primer (VEF):
ACACGCCTCGAGAAATGTGCTGTCTTTAGAAGCCACTG (SEQ ID NO:33); reverse
primer (VER): ACACGCGTCGACGATCCAATAGGAAAGCCCTTCCATAAAC (SEQ
ID NO:34)). This fragment was digested with XhoI and SalI and cloned into the SalI site of
the pSK vector. The resulting construct was designated pSK-KN. The 511 bp VEGFR2
enhancer is shown in Figure 11 (SEQ ID NO:35).

4. pGL3B2

The yellow-green luciferase containing vector pGL3B (Promega, Madison, WI) was
re-engineered as illustrated in Figure 12. First, pGL3B was digested with NotI and then
blunt ended with T4 DNA polymerase. A PmeI linker (New England Biolabs) was then
ligated into the vector. The new vector, pGL3B-Pme, was double digested with Asp718 and
HindIII and ligated with a synthetic linker resulting from the annealing of two complementary
oligos (GL3B-Forward GTACTTAATTAAGCTTGGTACCCGGGGCGGCCGC (SEQ ID
NO:36); GL3B-Reverse AGCTGCGGCCGCCCCGGGTACCAAGCTTAATTAA (SEQ ID
NO:37)). The new vector was designated pGL3B2.

5. Construction of pGL3B2-KPN

As illustrated in Figure 12, the VEGFR2 promoter was isolated from pSK-KP as a
HindIII-NotI fragment and cloned into the pGL3B2 vector that was linearized with HindIII
and NotI. The new construct was designated pGL3B2-KP. Subsequently, the VEGFR2
enhancer was isolated from pSK-KN as a XhoI-SalI fragment and cloned into the pGL3B2-KP
vector that was linearized with SalI. The new construct was designated pGL3B2-KPN.

6. Construction of pTKLG-Fos-KPN 

The VEGFR2 promoter-luciferase-enhancer cassette was isolated from pGL3B2-KPN
as a PacI-SalI fragment and cloned into the pTKLG-Fos vector that was linearized with
PacI and SalI. The new construct was designated pTKLG-Fos-KPN (Figure 13). Using this
targeting construct, the VEGFR2 promoter-luciferase-enhancer transgene cassette is
targeted to the FosB gene locus through homologous DNA recombination, as illustrated in
Figure 14.

C. Tie2 targeting vector pTKLG-Fos-TPN
1. Cloning of Tie2 promoter

A 477 bp region of the mouse Tie2 promoter has been isolated and sequenced (Fadel
et al (1998) Biochem J. 330:335-343). Using this region, primers that were able to amplify a
0.45 kb (TF3-TR1; Table 1) and a 0.47 kb fragment (TF2-TR1; Table 1) were designed. It
was confirmed that each pair of primers amplified the predicted product using mouse 129SvJ
genomic DNA as template. DNA sequences for these primers are shown in Table 1 above
and PCR amplification conditions are shown in Figure 6. These primers were used for PCR
screening of a mouse 129/SvJ genomic DNA BAC library. A large genomic DNA fragment
of the Tie2 promoter region was obtained. Based on the Tie2 genomic DNA restriction map
that was published (Dumont et al (1994) Genes and Development 8:1897-1909), a 10.5 kb
Asp718-EcoRV fragment spanning the Tie2 promoter region was subcloned from the Tie2
BAC clone into the pSK vector linearized with Asp718 and EcoRV. The new construct
was designated pSK-T67.

2. Engineering of the Tie2 promoter 

pSK-T67 was further engineered to delete the entire 3.4 kb sequence spanning from the ATG
translational start codon to the EcoRV site. A 1.0 kb 3' end promoter region was amplified by
PCR (T2 forward primer: TATCAACACTCGGGAGGCTGAGGGAG (SEQ ID NO:38);
T2 reverse primer: ATAAGAATGCGGCCGCACTTCCCCAGATCTCCCCATCCAGC
(SEQ ID NO:39)) and digested with BstAPI and NotI. The 0.55 kb BstAPI-NotI fragment
was used to replace the 4.0 kb BstAPI-NotI fragment of pSK-T67 and the resulting construct
was designated pSK-TP, which contains Tie2 promoter sequences of 7.1 kb, spanning from an
Asp718 site to the ATG translational start codon. The 7.1 kb Tie2 promoter was fully
sequenced and the sequence is shown in Figure 15 (SEQ ID NO:40). The present
invention includes, but is not limited to, an isolated polynucleotide having at least 90%,
preferably 92%, more preferably 95%, and even more preferably 98% sequence identity to
the sequence presented as SEQ ID NO:40.
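
The fragment sizes quoted for the two promoter-engineering steps can be checked with simple arithmetic: the engineered insert is the parent fragment minus the excised fragment plus the PCR-derived replacement. The Python sketch below uses the rounded sizes stated in the text, converted to base pairs; the function name is hypothetical and the results are therefore only approximate.

```python
def size_after_replacement(parent_bp: int, removed_bp: int, inserted_bp: int) -> int:
    """Insert size after swapping one restriction fragment for another."""
    if removed_bp > parent_bp:
        raise ValueError("removed fragment cannot be larger than the parent fragment")
    return parent_bp - removed_bp + inserted_bp


# VEGFR2: 4.6 kb HindIII-XbaI fragment; 0.45 kb Bsu36I-NotI fragment replaced by 0.3 kb
vegfr2_promoter_bp = size_after_replacement(4600, 450, 300)

# Tie2: 10.5 kb Asp718-EcoRV fragment; 4.0 kb BstAPI-NotI fragment replaced by 0.55 kb
tie2_promoter_bp = size_after_replacement(10500, 4000, 550)
```

The computed values of about 4.45 kb and 7.05 kb agree, within the rounding of the quoted fragment sizes, with the stated 4.5 kb VEGFR2 and 7.1 kb Tie2 promoter lengths.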

3. Cloning of Tie2 enhancer 

In a previous report, it was described that a 1.7 kb region within the first intron of the
Tie2 gene functions as an endothelial cell specific enhancer (Schlaeger et al (1997) PNAS
USA 94: 3058-3063). Accordingly, this 1.7 kb Tie2 enhancer region was subcloned from
the Tie2 BAC clone DNA as a XhoI-Asp718 fragment into the pSK vector that was
linearized with the same enzymes. The Asp718 site was then converted to a SalI site using a
SalI linker (New England Biolabs). The resulting construct was designated pSK-TN. The
1.7 kb Tie2 enhancer was fully sequenced and the sequence is shown in Figure 16 (SEQ ID
NO:41).



4. Construction of pGL3B2-TPN

As illustrated in Figure 17, the Tie2 promoter was isolated from pSK-TP as an
Asp718-NotI fragment and cloned into the pGL3B2 vector that was linearized with Asp718
and NotI. The new construct was designated pGL3B2-TP. Subsequently, the Tie2 enhancer
was isolated from pSK-TN as an XhoI-SalI fragment and cloned into the pGL3B2-TP vector
linearized with SalI. The new construct was designated pGL3B2-TPN.
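Cloning an XhoI-SalI fragment into a SalI-cut vector works because XhoI and SalI leave the same 5' overhang. The following Python sketch illustrates this using the standard recognition sites of the enzymes (XhoI C^TCGAG, SalI G^TCGAC, Asp718 G^GTACC); the function names are hypothetical and the code is an illustration, not part of the described procedure.

```python
# Recognition sites written 5'->3'; "^" marks the top-strand cut position.
SITES = {
    "XhoI": "C^TCGAG",
    "SalI": "G^TCGAC",
    "Asp718": "G^GTACC",
}

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def five_prime_overhang(site_with_cut):
    """Return the 5' overhang left by an enzyme that cuts a palindromic
    site at the same offset on both strands (a 5' cutter)."""
    cut = site_with_cut.index("^")
    site = site_with_cut.replace("^", "")
    return site[cut:len(site) - cut]


def compatible(enzyme_a, enzyme_b):
    """Two 5' overhangs can anneal if one is the reverse complement of the other."""
    a = five_prime_overhang(SITES[enzyme_a])
    b = five_prime_overhang(SITES[enzyme_b])
    return a == reverse_complement(b)


for pair in [("XhoI", "SalI"), ("XhoI", "Asp718")]:
    print(pair, "compatible" if compatible(*pair) else "not compatible")
```

Because both ends of an XhoI-SalI fragment carry the same TCGA overhang, such a fragment can in principle ligate into a SalI site in either orientation, so the orientation of the insert would normally need to be confirmed.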

5. Construction of pTKLG-Fos-TPN

The Tie2 promoter-luciferase-enhancer cassette was isolated from pGL3B2-TPN as a
PacI-SalI fragment and cloned into the pTKLG-Fos vector linearized with PacI and SalI.
The new construct was designated pTKLG-Fos-TPN (Figure 18). Using this targeting
construct, the Tie2 promoter-luciferase-enhancer transgene cassette is targeted to the FosB
gene locus through homologous DNA recombination, as illustrated in Fig. 19.

Example 4 

Generation of Transgenic Mice Carrying the Constructs of the Present Invention

A. General Procedure: Figure 7 depicts a generalized scheme for the generation of
transgenic mice using the targeted transgenic vectors described in Example 3. Details
regarding embryonic stem (ES) cell culture, transfection, blastocyst injection, and
implantation into a pseudopregnant foster mother are described, for example, in Hogan et al. (1994)
"Manipulating the Mouse Embryo, A Laboratory Manual. Second Edition", Cold Spring
Harbor Laboratory Press.

After construction, the targeted transgenic constructs are transfected into C57BL/6
embryonic stem (ES) cells (Genome Systems, Inc., St. Louis, MO)
by electroporation. The antibiotic G418 is used to select for cells in which the DNA
construct containing the Neo gene is integrated, either randomly or by homologous
recombination. The nucleoside analog ganciclovir is converted by TK to a cytotoxic
derivative. Cells in which the DNA has integrated by homologous recombination lose the TK
gene and are resistant to the drug, whereas cells that have incorporated the DNA randomly are
likely to retain the TK gene. Thus, cells containing random integrations into a chromosomal
location that allows the expression of the TK gene are killed. The G418- and ganciclovir-resistant
clones are then screened by PCR and Southern blot analysis, and those that have undergone
homologous DNA recombination are used for FVB/N blastocyst injection (Genome Systems,
Inc.). Between 4 and 16 blastocysts are transferred to the uterus of a pseudopregnant foster
mother. The pups are typically born 17 days after the transfer. Either random-bred mice or
F1 hybrid mice make suitable recipients. Females of certain random-bred stocks (e.g., CD1
mice, from Charles River Laboratories) have very large ampullae, which makes oviduct
transfer easier. These mice also generally make good mothers. Alternatively, F1 hybrid
females (e.g., B6 x CBA F1) can be used as recipients. Although their ampullae are smaller,
they make exceptionally good mothers, rearing litters as small as two pups. See, for example,
Hogan et al. (1994), supra.
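The positive-negative selection logic described above can be summarized as a small truth table. The Python sketch below is illustrative only; the function name and the clone labels are hypothetical, and the model reduces each clone to two yes/no properties.

```python
def survives_selection(has_neo, expresses_tk, g418=True, ganciclovir=True):
    """Return True if a clone survives the applied drugs.

    has_neo:      the construct (with the Neo gene) is integrated
    expresses_tk: the TK gene was retained and is expressed
    """
    if g418 and not has_neo:
        return False  # no Neo gene: killed by G418
    if ganciclovir and expresses_tk:
        return False  # TK converts ganciclovir to a cytotoxic derivative
    return True


# Illustrative clone types as (has_neo, expresses_tk).
CLONE_TYPES = {
    "no integration": (False, False),
    "homologous recombination (TK lost)": (True, False),
    "random integration, TK expressed": (True, True),
    "random integration, TK not expressed": (True, False),
}

for label, (neo, tk) in CLONE_TYPES.items():
    outcome = "survives" if survives_selection(neo, tk) else "killed"
    print(f"{label}: {outcome}")
```

In this simplified model, a random integrant that does not express TK survives both drugs, which is consistent with the need to screen doubly resistant clones by PCR and Southern blot before blastocyst injection.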

B. Screening for homologous DNA recombination positive ES cells

1) pTKLG-Fos/VEGFR2: Homologous DNA recombination between the
pTKLG-Fos/VEGFR2 targeting vector and the FosB gene is analyzed by Southern blot
analysis as shown in Figure 8. Genomic DNA prepared from G418-resistant ES cells is
digested with PvuII and probed with probe A to confirm the 5' end DNA recombination.
PvuII digestion of DNA bearing homologous recombination reveals two separate bands of
8.2 and 4.0 kb, whereas digestion of DNA from homologous recombination negative clones
reveals only the 8.2 kb band. The 3' end of DNA recombination is tested by hybridizing
NotI-digested DNA with probe B. NotI digestion of DNA bearing homologous recombination
reveals two separate bands of >8.2 kb and 5.0 kb, whereas digestion of DNA from
homologous recombination negative clones reveals only the >8.2 kb band. Once
homologous DNA recombination is confirmed, positive clones are selected for FVB/N
blastocyst injection.

2) pTKLG-Fos/Tie2: Homologous DNA recombination between the
pTKLG-Fos/Tie2 targeting vector and the FosB gene is analyzed by Southern blot in a
manner similar to that described above for pTKLG-Fos/VEGFR2. Once homologous DNA
recombination is confirmed, positive clones are selected for FVB/N blastocyst injection.

3) pTKLR-Vn/VEGF: Homologous DNA recombination between the
pTKLR-Vn/VEGF targeting vector and the vitronectin gene is analyzed by PCR. DNA
primers, designed according to the predicted homologous recombination, are listed in
Table 2.

Table 2

PCR primers for analysis of homologous DNA recombination between the pTKLR-Vn/VEGF
targeting vector and the vitronectin gene

5' end primers

F51 5'- CCCAGTGTCTCTGATTTAGGGAGAGCACCTGAG -3' (SEQ ID NO:42)
R51 5'- CCAGACTGCCTTGGGAAAAGCGCCTC -3' (SEQ ID NO:43)
F52 5'- CAGTGAGAGTCTTCTCTGTCCCTCAATCGGTTCTG -3' (SEQ ID NO:44)
R52 5'- TGGATGTGGAATGTGTGCGAGGCCAG -3' (SEQ ID NO:45)

3' end primers

F31 5'- AATCAAAGAGGCGAACTGTGTGTGAGAGGTCC -3' (SEQ ID NO:46)
R31 5'- CGGCTCCCCAAAATGTGGAAGCAAGC -3' (SEQ ID NO:47)
F32 5'- GAATCCATCTTGCTCCAACACCCCAACATC -3' (SEQ ID NO:48)
R32 5'- CGCCTCCTCTCCCCAGTCTCCCCTTG -3' (SEQ ID NO:49)

Primers F51-R51 and F52-R52 amplify a 1799 bp and a 1841 bp DNA fragment,
respectively, from the 5' end of the transgene that is integrated into the vitronectin site
through homologous DNA recombination, whereas primers F31-R31 and F32-R32 amplify
a 3549 bp and a 3428 bp DNA fragment, respectively, from the 3' end of the transgene that is
integrated into the vitronectin site through homologous DNA recombination. Clones that
allow successful amplification of both the 5' end and 3' end of the integrated transgene are
selected for FVB/N blastocyst injection.
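The PCR screening criterion can be sketched as a simple check of observed band sizes against the expected amplicon sizes. The Python code below is a minimal illustration under stated assumptions: the function names, the 5% size tolerance, and the rule that one matching primer pair per end is sufficient are all hypothetical and are not specified in the text.

```python
# Expected amplicon sizes (bp) for each primer pair, from the text above.
EXPECTED_BP = {
    ("F51", "R51"): 1799,
    ("F52", "R52"): 1841,
    ("F31", "R31"): 3549,
    ("F32", "R32"): 3428,
}
FIVE_PRIME_PAIRS = [("F51", "R51"), ("F52", "R52")]
THREE_PRIME_PAIRS = [("F31", "R31"), ("F32", "R32")]


def band_matches(pair, observed_bp, tolerance=0.05):
    """True if an observed band is within the (assumed) tolerance of the
    expected size; None means no product was obtained."""
    if observed_bp is None:
        return False
    expected = EXPECTED_BP[pair]
    return abs(observed_bp - expected) <= tolerance * expected


def clone_is_positive(observed, tolerance=0.05):
    """observed maps a primer pair to an estimated band size in bp.

    Assumption: a clone is scored positive when at least one 5' end pair
    and at least one 3' end pair give a product of the expected size.
    """
    five_ok = any(
        band_matches(p, observed.get(p), tolerance) for p in FIVE_PRIME_PAIRS
    )
    three_ok = any(
        band_matches(p, observed.get(p), tolerance) for p in THREE_PRIME_PAIRS
    )
    return five_ok and three_ok


# Hypothetical gel readings for two clones.
clone_a = {("F51", "R51"): 1800, ("F31", "R31"): 3500}
clone_b = {("F51", "R51"): 1800, ("F31", "R31"): None}
print("clone A positive:", clone_is_positive(clone_a))
print("clone B positive:", clone_is_positive(clone_b))
```

A stricter rule, for example requiring both primer pairs at each end to amplify, could be obtained by replacing `any` with `all`.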

C. Analysis of chimeric mice 

The pups that develop from injected blastocysts include chimeras, which can be identified
by their agouti coat color when an ES cell derived from a mouse having a dark coat color
(e.g., C57BL/6) is injected into the blastocyst of a light coat color animal (e.g., FVB/N,
genotype B/B). DNA analysis (e.g., Southern blotting, PCR) is conducted to further confirm
the presence of the transgene in these pups as described above in Section B. These animals
may be obtained commercially, for example from The Jackson Laboratory, Bar Harbor, ME.

D. Generating targeted transgenic C57BL/6 mice with white coat color 

Breeding of the chimeric mice generates homozygous targeted transgenic mice, as
depicted in Figure 9. The targeted mice are used to monitor gene expression through the
measurement of luciferase-mediated light emission from the mice. In a preferred
embodiment, the targeted mouse has a light coat color (e.g., white coat color), because the
black coat (an example of a dark coat color) of C57BL/6 mice can absorb light
emitted from the body and may interfere with the sensitivity of the bioluminescence assay. An
inbred mouse strain, C57BL/6-Tyr C2j/+ (Jackson Laboratory, Bar Harbor, ME), is
available for this purpose. Mice of this strain have a white coat, yet they have the
same genetic background as C57BL/6 mice except that the gene responsible for the black
coat color is mutated. Unfortunately, C57BL/6-Tyr C2j/+ ES cells are not currently
available. Therefore, the breeding program illustrated in Figure 9 is designed to
generate mice that are homozygous for the target transgene and have a white coat color.
C57BL/6 ES cells are prepared as described above and introduced into a suitable blastocyst
(e.g., from the FVB/N strain of mice). The blastocysts are implanted into a foster mother.
Chimeric mice are shown in Figure 9 as white animals with black and green patches.
Chimeric animals are bred with C57BL/6-Tyr C2j/+ mice to create F1 hybrids. Subsequent
breeding of the F1 hybrids generates several types of mice, including the one that is
homozygous for the target transgene and has a white coat color (shown in Figure 9 as b/b;
L/L), which is used for in vivo gene regulation monitoring.
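The expected frequency of the desired b/b; L/L genotype can be estimated with a simple Mendelian calculation. The Python sketch below is illustrative only and rests on assumptions that are not taken from Figure 9: both F1 parents are assumed to be heterozygous at both loci (B/b; L/+), and the coat color locus and the transgene locus are assumed to assort independently (linkage between them would change the numbers).

```python
from fractions import Fraction
from itertools import product


def cross_one_locus(parent_a, parent_b):
    """Offspring genotype probabilities at one locus.

    Each parent is a tuple of two alleles; each allele is transmitted
    with probability 1/2.
    """
    outcomes = {}
    for allele_a, allele_b in product(parent_a, parent_b):
        genotype = "/".join(sorted((allele_a, allele_b)))
        outcomes[genotype] = outcomes.get(genotype, Fraction(0)) + Fraction(1, 4)
    return outcomes


# Hypothetical F1 x F1 intercross: B/b at the coat color locus,
# L/+ at the transgene locus (assumed genotypes).
coat = cross_one_locus(("B", "b"), ("B", "b"))
transgene = cross_one_locus(("L", "+"), ("L", "+"))

# Independent assortment (assumed): multiply the per-locus probabilities.
p_target = coat["b/b"] * transgene["L/L"]
print("P(b/b; L/L) =", p_target)
```

Under these assumptions about one in sixteen offspring of such an intercross would be expected to carry the white coat color together with a homozygous transgene.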

A C57BL/6 mouse and a C57BL/6-Tyr C2j/+ mouse are considered to be
substantially isogenic. Accordingly, the method of the present invention exemplified in
Figure 9 provides a means for generating breeding groups of substantially isogenic mice in a
selected genetic background carrying at least one transgene of interest.

E. Dual luciferase targeted transgenic mice

As described above, two targeting vectors are generated. pTKLR-Vn carries a red
luciferase gene and is targeted into the vitronectin locus. pTKLG-Fos carries a yellow-green
luciferase gene and is targeted into the FosB locus. A number of promoters, including the VEGF
promoter, VEGFR2 promoter, and Tie2 promoter, are cloned into these vectors, as described
above. Subsequently, three types of targeted transgenic mice are generated. VEGF mice carry a
VEGF promoter-red luciferase transgene (VEGF-LucR) integrated into the vitronectin locus.
VEGFR2 mice carry a VEGFR2 promoter-yellow-green luciferase (VEGFR2-LucYG)
transgene integrated into the FosB locus. Tie2 mice carry a Tie2 promoter-yellow-green luciferase
(Tie2-LucYG) transgene integrated into the FosB locus. Through the breeding program illustrated
in Figure 10, dual luciferase targeted transgenic mice are produced, carrying both the
VEGF-LucR and the VEGFR2-LucYG transgenes. The degradation of luciferin by yellow-green
luciferase and red luciferase generates light emitted at 540 nm and 610 nm,
respectively. These wavelengths of light are measured individually using a photon-counting
camera (intensified CCD). Therefore, both VEGF expression and VEGFR2 expression, for
example, can be monitored in the same mouse at the same time.

Example 5

Modulation of Expression Mediated by VEGFR2 Promoter Sequences 

The 4.5 kb VEGFR2 promoter identified in Example 3 and shown in Figures 5A-C
(SEQ ID NO:32) was cloned into the polylinker of pGL3B2 (Fig. 12) to control the
transcription of luciferase coding sequences. A 0.5 kb VEGFR2 enhancer
sequence was cloned downstream of the luciferase to enhance endothelial-specific
expression. The resulting expression construct (pGL3B2-KPN; Fig. 12) was used to
transiently transfect primary bovine endothelial cells (Clonetics) using lipofectamine
(Promega). The cells were seeded onto 12-well plastic culture plates (Nunc) prior to
transfection. The transfection was carried out according to the manufacturer's instructions
(Promega). Plasmid pRL-TK (Promega), containing Renilla luciferase driven by the
thymidine kinase promoter, was used as an internal control in all transfection experiments.
The primary bovine endothelial cells were cultured in EGM-2 MV medium (Clonetics) at
37°C in 5% CO2, 95% air. After transfection, the cells were lysed with passive lysis buffer
(Promega) and assayed with the Dual-Luciferase Reporter Assay System (Promega) for
luciferase activity.

Several angiogenesis and neoplasticity inhibitors (Sigma) were tested for their
effects on VEGFR2 expression in primary bovine endothelial cells
transiently transfected with pGL3B2-KPN as described above. Briefly, 24 hrs after
transfection, the cells were treated with selected angiogenesis and neoplasticity inhibitors for
36 hrs and assayed for luciferase activity. The tested compounds included the neoplasticity
inhibitor Mithramycin, and the angiogenesis inhibitors 2-Methoxyestradiol, Thalidomide, and
Fumagillin. At least some of the tested compounds had the effect of reducing luciferase
expression mediated by the 4.5 kb VEGFR2 promoter.
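Readings from an assay of this kind are commonly normalized to the internal control before treated and untreated cells are compared. The Python sketch below illustrates one such calculation; the function names and all numerical readings are hypothetical and are not data from these experiments.

```python
def normalized_activity(reporter, renilla):
    """Reporter signal divided by the co-transfected Renilla control signal."""
    if renilla <= 0:
        raise ValueError("internal control signal must be positive")
    return reporter / renilla


def percent_of_control(treated_wells, control_wells):
    """Mean normalized activity of treated wells as a percentage of the
    mean normalized activity of untreated control wells.

    Each argument is a list of (reporter, renilla) readings from
    replicate wells.
    """
    def mean_ratio(wells):
        ratios = [normalized_activity(rep, ren) for rep, ren in wells]
        return sum(ratios) / len(ratios)

    return 100.0 * mean_ratio(treated_wells) / mean_ratio(control_wells)


# Hypothetical readings in relative light units, for illustration only.
control_wells = [(12000, 3000), (11000, 2750)]
treated_wells = [(5000, 2500), (6300, 3150)]

result = percent_of_control(treated_wells, control_wells)
print(f"reporter activity: {result:.1f}% of untreated control")
```

Dividing by the Renilla signal corrects for well-to-well differences in transfection efficiency and cell number, so that a reduction in the ratio reflects reduced promoter-driven expression rather than fewer transfected cells.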

These results suggest that sequences derived from the 4.5 kb VEGFR2 promoter are
useful for screening for compounds capable of modulating VEGFR2-mediated angiogenesis.

Example 6 

Modulation of Expression Mediated by Tie2 Promoter Sequences 

The 7.1 kb Tie2 promoter identified in Example 3 and shown in Figure 15 (SEQ ID
NO:40) was cloned into the polylinker of pGL3B2 (Fig. 17) to control the transcription of
luciferase coding sequences. The resulting expression construct (pGL3B2-TP; Fig.
17) was used to transiently transfect primary bovine endothelial cells (Clonetics) using
lipofectamine (Promega). The cells were seeded onto 12-well plastic culture plates
(Nunc) prior to transfection. The transfection was carried out according to the manufacturer's
instructions (Promega). Plasmid pRL-TK (Promega), containing Renilla luciferase driven
by the thymidine kinase promoter, was used as an internal control in all transfection
experiments. The primary bovine endothelial cells were cultured in EGM-2 MV medium
(Clonetics) at 37°C in 5% CO2, 95% air. After transfection, the cells were lysed with
passive lysis buffer (Promega) and assayed with the Dual-Luciferase Reporter Assay System
(Promega) for luciferase activity.

Several angiogenesis and neoplasticity inhibitors (Sigma) were tested for their
effects on Tie2 expression in primary bovine endothelial cells transiently
transfected with pGL3B2-TP as described above. Briefly, 24 hrs after transfection, the cells
were treated with selected angiogenesis and neoplasticity inhibitors for 36 hrs and assayed
for luciferase activity. The tested compounds included the neoplasticity inhibitor
Mithramycin, and the angiogenesis inhibitors 2-Methoxyestradiol, Thalidomide, and
Fumagillin. At least some of the tested compounds had the effect of reducing luciferase
expression mediated by the 7.1 kb Tie2 promoter.

These results suggest that sequences derived from the 7.1 kb Tie2 promoter are
useful for screening for compounds capable of modulating Tie2-mediated angiogenesis.

As is apparent to one of skill in the art, various modifications and variations of the
above embodiments can be made without departing from the spirit and scope of this
invention. These modifications and variations are within the scope of this invention.

CLAIMS

What is claimed is: 

1. A method for identifying an agent which modulates the association of a
transcription regulator and a transcription factor, the method comprising: 

combining (i) a regulator of gene expression comprising a polynucleotide sequence 
derived from SEQ ID NO:32 wherein said polynucleotide sequence has cis transcriptional 
regulatory activity, (ii) at least one transcription factor having trans transcriptional 
regulatory activity which interacts with said regulator, and (iii) a candidate agent, said 
combining performed under conditions wherein, but for the presence of the candidate agent, 
the regulator and the transcription factor form a first association; and 

detecting the presence of a second association of the regulator and the transcription 
factor, wherein a difference between the first and the second association indicates that the 
candidate agent is an agent that modulates the association of the transcription regulator and 
the transcription factor. 

2. The method of claim 1, wherein said regulator of gene expression is operably 
linked to a reporter sequence. 

3. The method of claim 2, wherein the method is an in vitro cell-based transcription
assay.

4. The method of claim 3 wherein:

the combining step comprises contacting (a) a cell comprising (i) the regulator
operably linked to the reporter sequence, and (ii) the transcription factor, with (b) the
candidate agent, under conditions wherein, but for the presence of the candidate agent, the
regulator and the transcription factor form a first association resulting in a first expression of
the reporter sequence; and

the detecting step comprises detecting the presence of a second expression of the
reporter sequence, wherein a difference between the first and the second expression
indicates that the candidate agent is an agent that modulates the association of the
transcription regulator and the transcription factor.

5. The method of claim 4, wherein the detecting step comprises detecting a 
colorimetric or luminescent signal resulting from expression of the reporter sequence. 

6. The method of claim 4, wherein the reporter sequence is detected by hybridization 
to a probe nucleic acid specific for the reporter sequence. 

7. The method of claim 4, wherein said cell is an endothelial cell. 

8. The method of claim 7, wherein said endothelial cell is a mammalian cell. 

9. The method of claim 8, wherein said mammalian cell is selected from the group 
consisting of human cells, bovine cells, and rodent cells. 

10. The method of claim 4, wherein said cell is transfected with the regulator 
operably linked to the reporter sequence. 

11. The method of claim 10, wherein said transfected cell is transiently transfected.

12. The method of claim 1, wherein said regulator of gene expression consists of the 
polynucleotide sequence presented as SEQ ID NO:32. 

13. The method of claim 1, wherein said regulator of gene expression consists of X
contiguous nucleotides, wherein (i) the X contiguous nucleotides have at least about 90%
identity to Y contiguous nucleotides derived from SEQ ID NO:32, (ii) X equals Y, and (iii)
X is greater than or equal to 50.

14. The method of claim 13, wherein X is greater than or equal to 500. 

15. The method of claim 1, wherein the method is an in vivo transcription assay. 

16. The method of claim 15, wherein a transgenic animal comprises said regulator 
operably linked to the reporter sequence. 

17. The method of claim 15, wherein 

the combining step comprises introducing the candidate agent into the transgenic 
animal, under conditions wherein, but for the presence of the candidate agent, the regulator 
and the transcription factor form a first association resulting in a first expression of the 
reporter sequence; and 

the detecting step comprises detecting the presence of a second expression of the 
reporter sequence, wherein a difference between the first and the second expression 
indicates that the candidate agent is an agent that modulates the association of the 
transcription regulator and the transcription factor. 

18. The method of claim 17, wherein said introducing is accomplished via a route
selected from the group consisting of oral, intravenous, intramuscular, transdermal, and
mucosal.

19. The method of claim 1, wherein said agent is a candidate for inhibiting 
angiogenesis. 

20. The method of claim 1, wherein said reporter sequence encodes a light
generating protein.

21. The method of claim 20, wherein the sequences encoding the light-generating
protein are obtained from either procaryotic or eucaryotic sources.

22. The method of claim 21, wherein the light generating protein is a luciferase. 

23. An isolated polynucleotide comprising the sequence presented as SEQ ID NO:32.

24. An isolated polynucleotide comprising a cis-acting transcription regulator
having X contiguous nucleotides, wherein (i) the X contiguous nucleotides have at least 
about 90% identity to Y contiguous nucleotides derived from SEQ ID NO:32, (ii) X equals 
Y, and (iii) X is greater than or equal to 50. 

25. The isolated polynucleotide of claim 24, wherein X is in the range of 50-3570 
including all integer values in that range. 

26. The isolated polynucleotide of claim 24, wherein X is greater than or equal to 500.

27. An expression cassette comprising

a cis-acting transcription regulator comprising X contiguous nucleotides, wherein (i)
the X contiguous nucleotides have at least about 90% identity to Y contiguous nucleotides
derived from SEQ ID NO:32, (ii) X equals Y, and (iii) X is greater than or equal to 50,
wherein said regulator is operably linked to a reporter sequence.

ABSTRACT OF THE DISCLOSURE

METHODS AND COMPOSITIONS FOR SCREENING FOR ANGIOGENESIS
MODULATING COMPOUNDS

The present invention relates to novel promoters, including transcription regulator
regions, for the mouse VEGFR-2 receptor, isolated polynucleotides comprising such
promoters, nucleic acid constructs comprising such promoters operatively linked to genes
encoding a gene product, such as a reporter, a protein, polypeptide, hormone, ribozyme, or
antisense RNA, recombinant cells comprising such nucleic acid constructs, screening for
therapeutic drugs using such cells (e.g., screening for compounds that modulate VEGFR-2
mediated angiogenesis), and endothelial tissue-specific gene expression using these novel
promoter sequences.




[FIG. 2: vector maps for the Luc-R and Luc-YG constructs. Labeled restriction sites: SwaI (2206), SbfI (2214), SrfI (3884), PacI (3892), NdeI (3905), Asp718I (3910), PmeI (3916), NotI (3924), HindIII (3932), SalI (3938), AscI (3944).]




[FIG. 3A: figure not reproduced; the only surviving label is NotI (5553).]

[Sequence figure: double-stranded nucleotide sequence listing printed in numbered 100-nucleotide rows (the numbering restarts at 1 partway through); the sequence text did not survive extraction in recoverable form.]

901 ?SSS TGATC GACGCTTCrT GGGGACGCAG ATCCTATCTC ACCCCATCCC CTGCAAGACA GTCTGAGAGA TTCTCGCTGT CACTTITCIC 

CCCOCCCOCC CCCGCACIAC CTGCGAAGAA CCOCTGCGTC TAGGATACAG TGGGGTAGGG GACG1TCH3T CAGACTCTCT AAGAGCGACA QiSaS 

1001 TGCC1ATCAG TTCACTGAAA CCTGTCAGTC TCACTGGGAA GAGACAGACA CTCGGAAGGG ATCCTCTCAA CTCTTAGGCC GGTCCCCCAA CMH7mr.A 
ACGGATAGTC AAGTGACTIT GGACAGTCAG AGTGACCCIT CTCTGTCTGT GAGCCTTCCC TACGAGAOTT GAGAATCCGG 

1101 fS^^ ^^ 1QCGG GAGCCCTCAT GCAGTCGGGG GBaMM - lW GTGTCAGTCG AGAGGAAGGC TTGGCTAAGG CCTCTCCCTC TCCCTCCCTC 
TGACOCTftGA GGCGGACGCC CTCGGGAGTA CGTCAOCCCC CACACAAACA CACACTCAOC TCTCCTTOCG AACCGATTCC GGAGAGGGAG AGGGAGGGAG 

1201 T^T^ STSS^S T ' ITrGGOTGTA TOTCTGTCTG AATGTCTGTG GCTCCATCCC GGGAGTTTCT CACCAGGTCC TCTCCAGCCT CCTCTCCCAC 
^ ACfiOCACCCC CAACCCCCCA AAACCGACAT ACACACACAC TEACAGACAC CGAGGTAGGG CCCTCAAACA GTCGTCCAAG ACAGGTCGGA GGAGAGGGTG 

S 1301 £gSS£ iJK!^ S^?^ TICACCACCC GCTGGAACCG TCCAfiCCITT CCCCGAGGAA GAAGGAGGAG GTAGAAQGCA 

== GGTGGGGGGG TGTGGATTCT CAGTCGTTGG GCXCCACACT AAGTGGTGGG CGACCTTGGC ACGTK3GAAA GGGGCTCCTT CTICCICCTC CATCTTCCGT 

^ 1401 ^TST CTCATT AACCACTGCG TCACGGTGTA GTGGAAGGGT GGGTGTTGTG GCTTTTTGCC TCTGACACAC ACATCCACAC CCGCTCACCC 

J| CAACTTGTCT TAGAGAGTAA TIGGTCACGC AGTGCCACAT CAOCITCCCA CCCACAACAC CGAAAAAOGG ACACTOTOTG TCTAGGTCTG ScSgS 

pl501 TGTGCTCACT CACAGGGTCG GTCTCICTTA TCTCTCTTGG GCGTGTGTCT GTCGGTGGCT TIGTTTCTCT GTCTACGCCT GTGTCTGTOT crrrrirrrr 
ACACGAGTGA GTGTCCCAGC CAGAGAGAAT AGAGAGAACC CGCACACACA CAGCCACCGA A^S SSSI 

!yi601 ST^S^ SS^? 10 ™ GGGAAATGCC CGGCTCCITC GTGCCAACGG TCAC03CAAT CACAACCAGC CAGGATCTIC AGTCGCTCGT GCAACCCACC 
%! CATCCTCACG CGGCCAGAGC CCCTmCGG GOCGAGGAAG CACGGITCCC AGTGGCGTEA GTGTTGGTCG GTCCxS S 

f 1701 CTCATCTCTT CCATGGCCCA GTCCCAGGGG CAGCCACTGG CCTCCCAGCC TCCAGCTOIT GACCCTTATG ACATGCCAGG AACCAGCTAr ir-aarrrrir 
- GAGTAGAGAA GGTACCGGGT CAGGGTCCCC GTCGGTGACC GGAGGGTCGG AGGTCGACAA SSa^ SS 

- 1801 S^TSSSTSS STfS^5SS T GGCGGGGCAA gcggaagtcg TGGGCCTTCA accagcacaa CCACCAGTGG ACCTOK3TCT GCCOGTCCAG CCAGAGCCAG 
CGGACTCACG GATGTCGTCA CCGCCCCGTT CGCCTTCACC ACCCGGAAGT TQGTQGTGTT GGTGGTCACC TGGACACAGA CGGGCAGGTC GGTCTCGGTC 

IP 01 ™S ^5^? "SSS™ 35 GATCGftGGAG OCTAGCTAGG GATGTCGGCT CAGPITOIAC AGTGCCTTCC 

^ CGGATCxECT GGGGCTCTTC TCTGTCATTC ATACTCCGGA GTCCTCAACC CTACCTCCTC GGATCGATCC CTACACCCGA GTCAAACATC TCAOGGAACG 

^ 2001 SS^T ^SSS 1 y 3C ? CftGCAT MGCCAGGAG TGGTTATGCA GACCTGTAAC CCCAGCTCTC AGAAGGTGGA GGCAGGAGGA GCAGGAGTTC 
^ ACGGTACGTA CTICTAGG3A TCGTGICGTA OTCGGTCCTC ACCAATACGT CTGGACATIG GGGTCGAGAG TCTTCCAOCT CCGT^TCCT OGI^TCAAG 

« 101 SgSSSS T^ETfST GOCTGCaCTG CAAGAGATCA TEATTTTCAA AAGTTGGCCT TGGGGGGAGG T3GGTCAGGG AAGTAAGAGA 

f1 CTCCGGTCGG ACACGATGAA TACCTCAGGT CGGACGTGAC GTICTCTAGT AATAAAAGTT TTCAACCGGA ACXXCCCTCC ACCCACTCCC TTCATTCTCT 

^ 201 SJSESSST ^^f TTTGTCA CmATAGTr GGAGG1TCCT CTCAGGCCTC AAGTCTGAAG GAACTTTACC ATICIGGCCA GTCAGGAGTA GGGGTTATXA 
•nCACTGTCA TTAAAACAGT GAATEATCAA CCTCCAAGGA GACTCCGGAG ITCAGACTIC CITGAAAT3G TAAGACCGGT CCCCAMAM 

2301 iJSIST f^T9^ S^SS 0 "GAI^EAT GGTCCTTATC TCTGACTCAG CTTACCCCAG AAGAAGAAGA 

AAACOCCAAG TCCTCCTTCC TTCAAAAGAA TCCCGACTAT CTCCATGGGG GTCTAGAGTA CCAGGAATAG AGACTGAGTC GAATGGGGTC TTCTICTICl' 

2401 S SSS^ jggS!^ ^SS^ GGAACCCTCG GftQGGAGCTO ACAGATCGAC TICAGGCGGT AAGGAGGAGT 

TnCGCTICC CAAGCGTCTC T0C3CCTTCTT CGACCGACGT CGATTCACGT CCTTCGCAGC CICCCTCGAC TGTCTAGCTG AAGTCCGCCA TTCCICCTCA 

2501 ^SSSST ^"^EE? TOCTGGSSGC ACTCTGCCTT GTECITCCCC CGTTTCTCAC TGIGOCTGTG TCCTAAAOGA GGAAACCCCC TCTEAGGGAA 
GftCCCOCSCA GAACTCCGGC ACGACCCTCG TGAGACGGAA CAAGAAGGGG GCAAAGAGTG ACACGGACAC AQGATTTGOT oStTGGG^ 

2601 "^^^ TGGAGTQQCT CCATATGCAT GCTCAGACCC ATOCCCACIT ACTTTCGACT GTTCCCCACT 1TCCCTCAAT ATGTCCCCAC 

GTCCOCAGIC A1ATCCGACT ACCTCACCGA GGTATACGTA CGAGTCTGGG TACGGGTGAA TGAAAGCTGA CAAGGGGTCA AAGGGACTTA TACAGGGGTG 

2701 mTS?SSE5 T CCTGGCTITC TCTCftGOCTA AGGAGACAAG CTAGAGGAGG TAATTCTCTC ACCTICTTIT CTTCACTAAA TAATAATCCA TTTIQCCTK 
TACAGTGGGA GGACCGAAA G AGAGTCGGAT TCCTCTGTTC GATCTCCTCC ATTAAGAGAG TGGAAGAAAA SaGTGATT? SS 

S^S^a T^^SS ATCTACCTST cgtagttcag ccctcctccc ccaacttgat AGCCTCAAGT TTCAGCCCTT GGCIGAGATC 

GACGGAGGTA AAAAAAAAGG ACTCGACCCC TAGATGGACA GCATCAAGTC GGGAGGAGGG GGTTGAACTA TCGGAGTTCA AAGTCGGGAA CCGACTCTAC 

2901 G^AG^AgS SS^T TMTITCTGC TAAGTCAATT CCTTGTCTCC TACTICAGCT ATCTACAGTlf CTGCCGAACT TGAGCTGGTG 

GGTAGTAGGA CTCACCGAGA CCGACCTTTG ATAAAACACG ATTCAGTTAA GGAACAGACG ATCAAGTCGA TAGATGTCAR GACGGCTTGA ACTCGACCAC 

3001 ISTE?? 7??™^ GTCCMCCCC CCACACACAA AACTICATGC CTCCOCCTTO AAACCAGGGT GCGTCTCTGA 

OGCGGGTGGT TCGGGTGAAG AAAGAGAGAA AAAATGGAGT CACGTTGGGG GGTCTGTGTT TTCAAGTACG GACGGGGAAC TITCGTCCCA CGCAGAGACT 

3101 AACAGAACCT CATTAAAAAC AACACATAAG CATTACCTAC TGACTCAACA AACTGTAGTG TTTTICTITT 

GAGGGGCAGC OCTCCGACTT CCTCTACCCA TTCTCTTGGA GTAATTTTTG TTGTGTATTC GTAATGGATG ACTGAGTTGT 1TCACATCAC AAAAAGAAAA 

3201 AAAATTATTT CGITIGTITA TTIATTATIT GdTATGTTT GAGTGAGTGC TQGTGCACCA CAGCACACAT ACGAGGTCAG AGGGAAATTT 

AAGGAGAGTT TTTTRATAAA GCAAACAAAT AAATAATAAA CGAATACAAA CICACTCACG AOCAOTTGGT S SS S^AaI 

3301 ^SS^ TICTCTCCTT CCGTOITCTC GGTGCTTGCT GGCAATCTCC TTCACTCAGT GAGCTACAAT GCCCCCTTCT GCCCTTTAAG GCAGAGTACT 
AGTATCAAAC AAGAGAGGAA GGCACAACAC CCACGAACGA CCGTTAGAGG AAGTCAGTCA CTCGATGTTA CoSSaAGA CGGGAAATTC CGTCTCATCA 

3401 gII^S^ SSSSS I^S^ ga^caa atoitcacca tcacaccagg cttggagttc tigcctatca gtcacgtcca 

GGAATCATGT CCCCCTGGGA AAGGAGCCGG AGAGTTTCAA CTCTAATGTT TACAAG1GGT AGTOTOGTCC GAACCTCAAG AACGSATAGT CACTOCAGGT 

3501 SETEST* ^STES AA £^ TCTIT XAGTCTSATG GGGAAACCGA GGCACGAGTA GCATCGTCTA CCAGGATTTC dCTTAGGGG ACX3GTCCCCT 
GAGGACGGAT CGAAGAAGGG TTGGTAGAAA ATCAGACTAC OCCTTTGGCT CCGTCCTCAT CGTACCAGAT GGTCCTAAAG GAGAATC^C TOCCAGGGGA 
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3601 CAGTTGGGAG GGAGCTGTCC AGCCCCCTGG ATCAGCAGCA AGAATGTATG AGTCTGGGGT TQGQCGGGTG AAGCTACTCT GTGTQGTCGC TGACCAGCAA 

GTCAACCCTC CCTCGACAGG TCGGGGGACC TAGTCGTCGT TCTTACATAC TCACACCCCA ACCCGCCCAC TTCGATGAGA CACACCAGCG ACTGGTCGXT 
3701 TTCTCCTTTC TCTCTCTCCT ATGACCTGGC CCTQCTGGGA TCCATTAGGA AACTGATCAG CTTGAAGAGG AAAAGGCAGA GCTGGAGTCG GAGATCGOCG 

AAGAGGAAAG AGACAGAGGA TACTGGACCG GGACGACCCT AGGTAATCCT TTGACTAGTC GAACTTCTCC TITTCCGTCT CGACCTCAGC CTCTAGCGGC 
3801 AGCTGCAAAA AGAGAAGGAA CGCCTGGAGT TTGTCCTGGT GGCCCACAAA CCGGGCTGCA AGATCCCCTA CGAAGAGGGG CCGGGGCCAG GCCCGCIGGC 

TCGACGTTTT TCTCTTCCTT GCGGACCTCA AACAGGACCA CCGGGTGTTT GGCCCGACGT TCTAGGGGAT GCTTCTCCCC GGCCCCGGIC CGGGCGACCG 
3901 CGAGGTGAGA GATTTGCCAG GGTCAACATC CGCTAAGGAA GACGGCTTCG GCTGGCTGCT GCCGCCCCCT CCACCACCGC CCCTOCCCTT CCAGAGCAGC 

GCTCCACTCT CTAAACGGTC CCAGTTGTAG GCGATTCCTT CTGCCGAAGC CGACCGACGA CGGCGGGGGA GGT3GTGGCG GGGACGGGAA GGTCICGTCG 
4001 CGAGACGCAC CCCCCAACCT GACGGCTTCT CTCTTTACAC ACAGTGAAGT TCAAGTCCTC GGCGACCCCT TCCCCGTTGT TAGCCCTTCG TACACTICCT 

GCTCTGCGTG GGGGGTK3GA CTGCCGAAGA GAGAAATGTG TGTCACTTCA AGTTCAGGAG CCGCTGGGGA AGGGGCAACA ATCGGGAAGC ATCTCAAGGA 
4101 CGTTTGTCCT CACCTGCCCG GAGGTCTCCG CGTTCGCCGG CGCCCAACGC ACCAGCGGCA GCGAGCAGCC GTCCGACCCG CTGAACTCGC CCTCCC1TCT 

GCAAACAGGA GTGGACGGGC CTCCAGAGGC GCAAGCGGCC GCGGGTTGCG TGGTCGCCGT CGCTOGTCGG CAGGCTGGGC GACTTGAGCG GGAGGGAAGA 
4201 TGCTCTGTAA ACTCTTTAGA CAAACAAAAC AAACAAACCC GCAAGGAACA AGGAGGAGGA AGATGAGGAG GAGAGGGGAG GAAGCAGTCC GGGGGTGIGT 

ACGAGACATT TGAGAAATCT GTrTGTTTTG TTTGTTTGGG CGTTCCTTGT TCCTCCTCCT TCTACTCCTC CTCTCCCCTC CTTCGTCAGG CCCCCACACA 
4301 GTGTGGACCC TTTCACTCTT CTGTCTGACC ACCTGCCGCC TCTGCCATCG GACATGACGG AAGGACCTCC TTTGTGTTTT GTGCTCTGTC TCTCGTTTTC

CACACCTGGG AAACTGAGAA GACAGACTGG TGGACGGCGG AGACGGTAGC CTGTACTGCC TTCCTGGAGG AAACACAAAA CACGAGACAG AGACCAAAAG 
4401 TGTGCCCCGG CGAGACCGGA GAGCTGGTGA CTTTGGGGAC AGGGGGTGGG GCGGGGATGA ACACCCCTCC TGCATATCTT TGTCCTGTTA CTTCAACCCA

ACACGGGGCC GCTCTCGCCT CTCGACCACT GAAACCCCTG TCCCCCACCC CGCCCCTACT TGTGGGGAGG ACGTATAGAA ACAGGACAAT GAAGTTGGGT 
4501 ACTTCTGGGG ATAGATGGCT GACTGGGTGG GTAGGGTQGG GTGCAACGCC CACCTTTGGC GTCTTACGTG AGGCTGGAGG GGAAAGAGTG CIGAGTGTCG 

TGAAGACCCC TATCTACCGA CTGACCCACC CATCCCACCC CACGTTGCGG GTGGAAACCG CAGAATGCAC TCCGACCTCC CCTTICTCAC GACTCACACC 
4601 GGTGCAGGGT GGGTIGAGGT CGAGCTGGCA TGCACCTCCA GAGAGACCCA ACGAGGAAAT GACAGCACCG TCCTCTCCTT CTTTICCCCC ACCCACCCAT 

CCACGTCCCA CCCAACTCCA GCTCGACCGT ACGTGGAGGT CTCTCTGGGT TGCTCCTTTA CTGTCGTGGC AGGACAGGAA GAAAAGGGGG TGGGTCGGTA 
4701 CCACCCTCAA GGGTGCAGGG TGACCAAGAT AGCTCTGTIT TGCTCCCTCG GGCCTTAGCT GATTAACTTA ACATTTCCAA GAGGTTACAA CCTCCTCCR3 

GGTGGGAGTT CCCACGTCCC ACTGGTTCTA TCGAGACAAA ACGAGGGAGC CCGGAATCGA CTAATTGAAT TGTAAAGGTT CTCCAATGTT GGAGGAGGAC 
4801 GACGAATTGA GCCCCCGACT GAGGGAAGTC GATGCCCCCT TTGGGAGTCT GCTAACCCCA CTTCCCGCTG ATTCCAAAAT GTGAACCCCT ATCTGACTGC 

CTGCTTAACT CGGGGGCTGA CTCCCTTCAG CTACGGGGGA AACCCTCAGA CGATTGGGGT GAAGGGCGAC TAAGGTTTTA CACTTGGGGA TAGACTGACG 
4901 TCAGTCTITC CCTCCTGGGA AAACTGGCTC AGGTTGGATT TITTTCCICG TCTGCTACAG AGCCCCCTCC CAACICAGGC CCGCTCCCAC CCCTOIGCAG 

AGTCAGAAAG GGAGGACCCT TTTGACCGAG TCCAACCTAA AAAAAGGAGC AGACGATGTC TCGGGGGAGG GTTGAGTCCG GGCGAGGGTG GGGACACGTC 
5001 TATTATGCTA TGTCCCTCTC ACCCTCACCC CCACCCCAGG CGCCCTTCGC CGTCCTCGTT GGGCCTTACT GGTTTTGGGC AGCAGGGGGC GCTCCGACGC

ATAATACGAT ACAGGGAGAG TGGGAGTGGG GGTGGGGTCC GCGGGAACCG GCAGGAGCAA CCCGGAATGA CCAAAACCCG TCGTCCCCCG CGACGCTGCG 
5101 CCATCTTGCT GGAGCGCTTT ATACTGTGAA TGAGTCGTCG GATTGCTGGG CGCGCCGGAT GGGATTGACC CCCAGCCCTC CAAAACTTTT CCIGGGCCTC 

GGTAGAACGA CCTCGCGAAA TATGACACTT ACTCACCAGC CTAACGACCC GCGCGGCCTA CCCTAACIGG GGGTCGGGAG GTTTT3AAAA GGACCCGGAG 
5201 CCCTTCTTCC ACTTGCTTCC TCCCTCCCCT TGACAGGGAG TTAGACTCGA AAGGATGACC ACGACGCATC CCGGTGGCCT TCTTGCTCAG GCCCCAGACT 

GGGAAGAAGG TGAACGAAGG AGGGAGGGGA ACTGTCCOTC AATCTGAGCT TTCCTACTGG TGCTGCGTAG GQCCACCGGA AGAACGAGTC CGGGGTCTGA 

5301 TTTTCTCTTT AAGTCCTTCG CCTTCCCCAG CCTAGGACGC CAACTTCTCC CCACCCTGGG AGCCCCGCAT CCTCTCACAG AGGTCGAGGC AATTTTCAGA
AAAAGAGAAA TTCAGGAAGC GGAAGGGGTC GGATCCTGCG GTTGAAGAGG GGTGGGACCC TCGGGGCGTA GGAGAGTGTC TCCAGCTCCG TTAAAAGTCT

5401 GAAGTTTTCA GGGCTGAGGC TITGCCTCCC CTATCCTCGA TATTTGAATC CCCAAATAGT TTTTGGACTA GCATACTTAA GAGGGGGCTG AGTTCCCACT 

CTICAAAAGT CCCGACTCCG AAACCGAGGG GATAGGAGCT ATAAACTTAG GGGTTTATCA AAAACCTGAT CGTATGAATT CTCCCCCGAC TCAAGGGTGA 
5501 ATCCCACTCC ATCCAATTCC TTCAGTCCCA AAGACGAGTT CTGTCCCTTC CCTCCAGCTT TCACCTCGTG AGAATCCCAC GAGTCAGATT TCTATTTTCT
TAGGGTGAGG TAGGTTAAGG AAGTCAGGGT TTCTGCTCAA GACAGGGAAG GGAGGTCGAA AGTGGAGCAC TCTTAGGGTG CTCAGTCTAA AGATAAAAGA
5601 AATATTGGGG AGATGGGCCC TACCGCCCGT CCCCCGTGCT GCATGGAACA TTCCATACCC TGTCCTGGGC CCTAGGTTCC AAACCTAATC CCAAACCCCA

TTATAACCCC TCTAGCCGGG ATGGCGGGCA GGGGGCACGA CGTACCTTGT AAGGTATGGG ACAGGACCCG GGATCCAAGG TTTGGATTAG GGTTTGGGGT 
5701 CCCCCAGCTA TTTA1CCCTT TCCTGGTTCC CAAAAAGCAC TTATATCTAT TATGTATAAA TAAATATATT ATATATGAGT GTGCGTOTCT GTGCGTGTCC 

GGGGGTCGAT AAATAGGGAA AGGACCAAGG GTTTETTCGTG AATATAGATA ATACATATTT ATTTATATAA TATATACICA CACGCACACA CACGCACACG 
5801 GTGCGTGCGT GCGTGCGTGC GAGCTTCCTT GTTTTCAAGT GTGCTGTGGA GTTCAAAATC GCTTCTGGGG ATTTGAGTCA GACTTTCTGG CTGTCCCTTT

CACGCACGCA CGCACGCACG CTCGAAGGAA CAAAAGTTCA CACGACACCT CAAGTTTTAG CGAAGACCCC TAAACTCAGT CTGAAAGACC GACAGGGAAA
5901 TTGTCACTTT TTTGTTGTTG TCTCGGCTCC TCTCGCTGTT GGAGACAGTC CCGGCCTCTC CCTTTATCCT TTCTCAAGTC TGTC1CGCTC AGACCACTIC 

AACAGTGAAA AAACAACAAC AGAGCCGAGG AGACCGACAA CCTCIGTCAG GGCCGGAGAG GGAAATAGGA AAGAGTTCAG ACAGAGCGAG TCTGGIGAAG 
6001 CAACATGTCT CCACTCTCAA TGACTCTGAT CTCCGGTCTG TCTGTTAATT CTGGATTTGT CGGGGACATG CAATTTTACT TCTGTAAGTA AGTGT3ACTG 

GTTGTACAGA GGTGAGAGTT ACTGAGACTA GAGGCCAGAC AGACAATTAA GACCTAAACA GCCCCTGTAC GTTAAAATGA AGACATTCAT 1CACACTGAC 
6101 GGTGGTAGAT TTTTTACAAT CTATA1CGTT GAGAATTCTG GGTGGAAATG TCTGATCAGG AGAAGGGCCT GCCACTGCCG ACCACAATTC ATT3ACTCCA 

CCACCATCTA AAAAATGTTA GATATAGCAA CTCTTAAGAC CCACCTTTAC AGACTAGTCC TCITCCCGGA CGGTGACGGC TGGTGTTAAG TAACTCAGGT 
6201 TAGCCCTCAC CCAGGCTGTA TTTG-TOATTT TTTTCATTTT GTTTTITrGT ATTTTCCACC TGACCCCGGG GGTGCTGGGG CAGTCTATCA CTCGGCAGCT 

ATCGGGAGTG GGTCCGACAT AAACACTAAA AAAAGTAAAA CAAAAAAACA TAAAACGTGG ACTGGGGCCC CCACGACCCC GTCAGATAGT GACCCGTCGA 
6301 CCCCTCCCCC CCTTGGTTCT GCACIGTCGC CAATAAAAAG CTTTTAAAAA ACTGTATCCT TCAGGTCAAA GTCTCTGTTT TCCCTGGACA TCTACTACAT 

GGGGAGGGGG GGAACCAAGA CGTGACAGCG GTTATTTTTC GAAAATTTTT TGACATAGGA AGTCCAGTTT CACAGACAAA AGGGACCTGT AGATGA1GTA 
6401 GGCTTCCITr CAGAAAAACG GAGTTTGGAT TGCTAGGGAA GTCTTCCTGG CACTTAGTGG GACGCCTAAC GAATCAGAAC CTACAACGGG ACTAAAAGGA 

CCGAAGGAAA GTCTTTTTGC CTCAAAOCTA ACGATCCCTT CAGAACGACC GTGAATCACC CTGCGGATIG CTTAGTCTTG GATGTTGCCC TGATTTTCCT 
6501 AGTGGAGACT TGCTAGGTTT TCCCATGTTC CCAGGCTGGG CCACCTACTT GAAAAAATAA GGGGCGGAAA AGTGTAAGGT ACCAAATTTG GIGAAGGGTC 

TCACCTCTGA ACGATCCAAA AGGGTACAAG GGTCCGACCC GGTGGATGAA CTITTTTATT CCCCGCCTTT TCACATTCCA TGGTTTAAAC CACTIGCCAG 
6601 TGGGAGAATT TCATGATCGG AAAAGAATTT ATTCACCTTG GGTGTGCAAT GAACTTTCAG CAACAGTTAA GGGCAAGGGT GTAAAAGCTG GGCACAACTT 

ACCCTCTTAA AGTACTAGCC TTTTCTTAAA TAAGTGGAAC CCACACGTTA CTTGAAAGTC GTTGTCAATT CCCGTTCCCA CATTTTCGAC CCGTGTTGAA 
6701 GTAAATCCTA GCATTTGAGA GGTGGAGGCA AGGGGATCAA CTGGTGGAGT TCAGTGTCAT GTGGATCGTA GATACCAAGC GCAAAGATCT GCTATGGGGA 

CATTTAGGAT CGTAAACTCT CCACCTCCGT TCCCCTAGTT GACCACCTCA AGTCACAGTA CACCTAGCAT CTATGGTTCG CGTTTCTAGA CGATACCCCT 
6801 GAGGGCTTGG TACACCAGGG GAGCCAGAAG TTTCGTGGTG AGGGTAGTGG AGGGCAAGTG GAGAGTGAGA GTTAGCCTCA GGGAGATTCT ACAGGCAATG 

CTCCCGAACC ATGTGGTCCC CTCGGTCTTC AAAGCACCAC TCCCATCACC TCCCGTTCAC CICTCACTCT CAATCGGAGT CCCTCTAAGA TGTCCGTTAC 
6901 ATGCAGAGTT CAGACGCTCC CTTTGAAAGC ACTAGAGAGC CGCAGCAGGT TTTGAGCAGA GAAGGTTAGA GTTAGGTOGT C1CTICTAGC CCATCCCAGG 

TACGTCTCAA GTCTGCGAGG GAAACTTTCG TGATCTCTCG GOGTCGTCCA AAACTCGTCT CTTCCAATCT CAATCCACCA GAGAAGA1CG GGTAGGGTCC 
7001 CTGAGGAGGA CGCTGAGGGT TTCAAGAAGG ATCGAGAATC GAAAGCAGAG GAGAAGAAGG ATCCAAGAGG CATGGAGGAG GCAGAACACA 'mClCl' I CT 

GACTCCTCCT GCGACTCCCA AAGTTCTTCC TAGCTCTTAC CTTTCGTCTC CTOTCTTCC TAGGTTCTCC GTACCTCCTC CGTCTTGTGT AAAGAGAAGA 
7101 TTAATAGCAA GCCTGGAAAG GATAACTTGC TGCAGGAGGA GATGCTCACC AGTCGQGTSG TCTAGGGGGT TCTTGGAAAA GAGAAGGCAT TTGCTCAAGC 

AATTATCGTT CGGACCTTIC CTATTGAACG ACGTCCTCCT CTACGAGTOG TCAGCCCACC AGATCCCCCA AGAACCITIT CTCTTCCGTA AACGAGTICG 
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7201 CTCGGTTCCC CCATTCTOQC TCTTCTGTCA GCTTGTCTTC CATTAAGTGT GTGTCTCAAG GCCACCCTGC TCAGGACTCC TTCTGAGACG AGCTTCTATC 
GAGCCAAGGG GGTAAGAGCG AGAAGACAGT CGAACAGAAG GTAATTCACA CACAGAGTTC CGGTGGGACG AGTCCTGAGG AACACICTCC TGGAAGATAC 
7301 CTCGAGTICA TTAAAAACAC AATTOCCTGG TOCCGTGCTC TCTCCACTCG CICAGTTACC TCAAAAGACC AGGGCTAAAG GTCTGATCAC AACTCTATCC 
GAGCTCAAGT AATTITTGTG TTAACGGACC ACQQCACGAG AGAGGTGACC GAGTCAATQG AGTTTTCTGG TCCCGATTTC CACACTAGTG TTGAGATAGG 
7401 CCATTACTGC TCCAACGCAG AGACAGGACT GAGCCGGAGT GAACAAATGA ACAAAAATGA CTAATAATGC ATGCGTGATT AAATACATAA AAGAGCAGAT 
GGTAATGACG AGGTTGCGTC TCTGTCCTGA CTCGGCCTCA CTTGTTTACT TGTITITACT GATTATTACG TACGCACTAA TTTATGTATT TTCTCGTCTA 
7501 GACTGGATGA GCAAATCGTT TAAGGAGAGA CAGCAAGATC CTAGAATTTT GGAGACTAAT TTAAATCCAT CTTTGAGATG CATTIGGTCG GAAATTCCTG 
CTGACCTACT CGTTTAGCAA ATTCCTCTCT GTCGTICTAG GATCTTAAAA CCTCTGATTA AATTTAGGTA GAAACTCTAC GTAAACCAGC CTTTAAGGAC 
7601 GGAGGAAAAA AAGTGTAAAT ATGAAGAGAG AATAAATGAG AATAGGGGTG GCTICAGAGA GGTTAACTGC GCGCTCGTCG CTTTTGTACA AGAATGTGAA 
CCTCCTTTTT TTCACATTTA TACTTCTCTC TTATTTACTC TTATCCCCAC CGAAGTCICT CCAATTGACG CGCGACCAGC GAAAACATGT TCETACACTT 
7701 TTGCAGGGAG CAAAATGGGA TAGATACTCC OGCCCGAAAG GTGGAATTGA ACCACTCTGT CGCTAAACAG CTACAGGTTT GAAGCCTGCA CCCCAGACCA 
AAOGTCCCTC GTTTTACCCT ATCTATGAGG GCGGGCTTTC CACCTTAACT TGGTGAGACA GCGATTTGTC GATGTCCAAA CTICGGACGT GGGGTCTGGT 
7801 CTGAGGATCA TCCGGGCGAA AGGAGCTATT TTCAGTTAGT TATATAAAGG CGAGATACTA CTACTTTTTA CACTTATGGT CATTATTIGT GGTATACAGT 
GACTCCTAGT AGGCCCGCTT TCCTCGATAA AAGTCAATCA ATATATTTCC GCTCTATGAT GATGAAAAAT GTGAATACCA GTAATAAACA CCATATGTCA 
7901 AGATAATTAA TTTCAATQGT TICGAACATT TTTTTTCACT TTTTCTTGTG AACATGTGTT TCCTCAGTAA AGTGTTCCGT GAATGACTCT ACTAACTAAA 
TCTATTAATT AAAGTTACCA AAGCTTGTAA AAAAAAGTGA AAAAGAACAC TTGTACACAA AGGAGTCATT TCACAAGGCA CTTACTCAGA TGATTGATTT 
8001 AAGTAAGTAG CTTCATTTGC ATAGCGCCTT GCATTTTGGG AAGCAGCGCC TAAAGTGCCT GTCTCCCTAA CTAAAAGCAG AATTTTTTGC AAAGTCAAAA 
TTCATTCATC GAAGTAAACG TATCGCGGAA CGTAAAACCC TTCGTCGCGG ATTTCACGGA CAGAGGGATT GATTTTCGTC TTAAAAAACG TTTCACTTTT 
8101 GTCAGTTTTA TTTTTCTTTG TTIGTTTGCT TGTTTGTTTT TAATGGAAAA ACTTCTCACG CXSGCCGATTC GTAGCAGAAT TCGAGATTTT CTGCAAGCGA 
CAGTCAAAAT AAAAACAAAC AAACAAACGA ACAAACAAAA ATTACX^TTTT TGAAGAGTGC GCCGGGTAAG CATCGTCTTA AGCTCTAAAA GACGTTCGCT 
8201 GAAGCAAGAC TITCGTAGGG TCTGACGGCA CGOGGCCGCA GAGCGACACC TGCCGTTGCT TTATAGAACT GCAAGTATGT AGGGAATCTA CTGAGTCCCT 
CTTCGTICTG AAAGCATCCC AGACTGCCGT GOGCCGGOGT CTCGCTGTGG ACGGCAAGGA AATATCTTGA CGTTCATACA TOCCTTAGAT GACTCAGGGA 
8301 AGGTGATGGA GTTGACAACC AACTCCCCTT GAGTTTAGAC GCTAAAAACC ATCCCTTITT ATATTTATGT GATTAGCCCA GGGAAACTAA GGdCAGACA 
TCCACTACCT CAACTGTTGG TIGAGGGGAA CTCAAATCTG CGATTTTTGG TAGGGAAAAA TATAAATACA CTAATCGGGT CCCTTTGATT CCGAGTCIGT 
8401 TGGATAATAC CACAGCCGAG TTCTTGTAGC CCAACTCCCT AGGGGAAATG AAACCTACAG T TC TOCr i 'm 1 AATATGCTTG GCCCAGGGGC AGTCGCCCTA 
ACCTATTATG GTCTCGGCTC AAGAACA1CG GGTTGAGGGA TCCCCTTTAC TITGGATGTC AACACCAAAA TTATACGAAC CGGGTCCCCG TCACCGGGAT 
8501 TTGGCAGGAG TGGCCTTATT AGCGGAGGTG TACCTTGTTA GAGAAGTGTG TCACTTGGAG GCGAGGTTTT GAGGTACGTA TGCTCAAGTC TGGCCAGTGT
AACCGTCCTC ACCGGAATAA TCGCCTCCAC ATGGAACAAT CTCTTCACAC AGTGAACCTC CGCTCCAAAA CTCCATGCAT ACGAGTTCAG ACCGGTCACA

8601 GATCCTGGCT GTCTGCAGAA CGTGGTCTCC TTCTGGCTGC CTTCGGATCA AGGTGTAGAA CTCTCAGCTC CTTCTCCAGC ACCATGTCTG CCTGCTTAAT
CTAGGACCGA CAGACGTCTT GCACCAGAGG AAGACCGACG GAAGCCTAGT TCCACATCTT GAGAGTCGAG GAAGAGGTCG TGGTACAGAC GGACGAATTA

8701 GCTTTGCTTC TTTCCATGAC GATAATGAAC TGTGCCTCTG AAACTGTAAG TCAGCCCCCC AGTTACATGT TTTCTTTTAT AAGAGTTGCA TATATATATG
CGAAACGAAG AAAGGTACTG CTATTACTTG ACACGGAGAC TTTGACATTC AGTCGGGGGG TCAATGTACA AAAGAAAATA TTCTCAACGT ATATATATAC

8801 TATGTATATA TGTATGTATA TATGTATGTA TATATATATA TATATATAAA CAGGGTCTCA CTCTTTAGCT CTGGCTGGCC TGAAATTCAC TATGTAGCCC
ATACATATAT ACATACATAT ATACATACAT ATATATATAT ATATATATTT GTCCCAGAGT GAGAAATCGA GACCGACCGG ACTTTAAGTG ATACATCGGG

8901 AGGATTGCCT GAACTTTGAA GCAATCTTCC TGCCTCAGCC TCCCAATGGT ATTACAGGCA TGAGTCACAA CAAGCCATTT AAATCTTATG ATGACTTATA 
TCCTAACGGA CTTGAAACTT GGTTAGAAGG ACGGAGTCGG AGGGTTACCA TAATGTCCGT ACTCAGTGTT GTTCGGTAAA TTTAGAATAC TACTGAATAT

9001 AGAAGACAGA AAATCAGAGT TCCTTTACCT AGTTCACAGA TCCCTACAAT CTAACCTCGT TCGCTCCATA AACAGCCCTA CCCCACCCTC CTCGAACTGC
TCTTCTGTCT TTTAGTCTCA AGGAAATGGA TCAAGTGTCT AGGGATCTTA GATTGGAGCA AGCGAGGTAT TTGTCGGGAT GGGGTCGGAG GACCTTCACG 
9101 TTTGAGGAAT GCTGCAGGCT CTCACAGGCA CACTCCTCCT TGGTTAATCT CTTCAGCCTG GTTGCCTTCC CCCCCCATCT CCATGTGGCC CAAAGCCTCT
AAACTCCTTA GGACGTCCGA GAGTGTCCGT GTCAGGAGGA ACCAATTAGA GAAGTCGGAC CAACGGAAGG GGGGGGTACA GGTACACCGG GTTTCGGAGA

9201 CATCCTGTTC TCAAATACCA CTAGCTAGTA AGGCTCCCCG ACCTGACCCG GTTTAAATAT TAGAAAAGGG TCACTTTCTC CCTGCCACAG ACAACCAAAC
GTAGGACAAG AGTTTATGGT GATCGATCAT TCCGAGGGGC TGGACTGGGC CAAATTTATA ATCTTTTCCC AGTGAAAGAG GGACGGTGTC TGTTGGTTTG
9301 CACCATATGC TTCTCACTTA CTACCTGACT ATCAAGGTTA ATAGATGTCT TCACAACCTT TCTCTGAGCC TCAGTTTCCC CACCTGCATA ATGCATCTGA
GTGGTATACG AACAGTGAAT GATGGACTGA TACTTCCAAT TATCTACAGA AGTGTTGGAA AGAGACTCGG AGTCAAAGGG GTGGACGTAT TACGTAGACT

9401 GACACAGAAT TCCCTAGAGC TGTGGTTCTC CTCATTCCTA GTGCTGGGAC CCTTTAATAC ATTTCCTCAT GTTGTGGTGA CCCCACCACC ACCATAAAAT
CTGTGTCTTA AGGGATCTCG ACACCAAGAG GAGTAAGGAT CACGACCCTC GGAAATTATG TAAAGGAGTA CAACACCACT GGGGTGGTCG TGGTATTTTA
9501 TATTTCCATT GATACTTCAT AACTGTAATT TITTCTATTG TTATGAATAG TAATGTAAGC ATTTGTGTTT CCCAGTGATC TTAGATGACC CTGTGGAAGA 
ATAAAGGTAA CTATGAAGTA TTCACATTAA AAAAGATAAC AATACTTATC ATTACATICG TAAACACAAA GGGTCACTAG AATCTACT3G GACACCTICT 
9601 GTCATTCCAC CCCAAAGGGG TCCCCACCAC AAGTTAAGAA TTCCIGCCAT AGAGGAATCA CAGGGACCAT GGATTAACAC TIGGGTCGAC TTTTGGGCTG 
CAGTAAGGTG GGGTTTCCCC AGGGGTGGTG TTCAATTCTT AAGGACGGTA TCTCCTTAGT GTCCCTGGTA CCTAATTGTG AACCCAGCIX3 AAAACCCGAC 

9701 CCTTCTGGGA GGCGCTAGAG CTAATGACAG CTACATCAAT TTCTGAAATT TTGTGTGTGT GTGTGTGTGT GTGTGTGTGT GTGTGTGTGT GCCCTGAGTC 
GGAAGACCCT CCGCGATCTC GATTACTGTC GATGTAGTTA AAGACTTTAA AACACACACA CACACACACA CACACACACA CACACACACA CGGGACTCAG 

9801 GGGTGCTGAG ATAGGCCAGT GGCTTTAGTG TICCTGGACC CATTACTCAC CAGAACICTC CCCTCACCTC ATTCTTDGAT GTGAACACTA TCTCTTCATA 

CCCACGACTC TATCCGGTCA CCGAAATCAC AAGGACCTCG GTAATGAGTG GTCTTGAGAG GGGAGTGGAC TAAGAAACTA CACTTGTGAT ACAGAAGTAT 
9901 GTGGCX3GTGG CAATAGCAGC AACAGTGAAC TAAATTTTAA AAGTAGAACT CAGCTOGAGA TACAAATATT GCAGTTTTCA AGTTCGGGTG GATTGTCTAA 

CACCX3CCACC GTTATCGTCG TTGTCACTIG ATTTAAAATT TTCATCTTGA GTCGACCTCT ATGTITATAA CGTCAAAACT TCAACCCCAC CTAACAGATT 
10001 TAACTTAATA ACATAACCCA GAAGAGAGGC CCCTTGGTCT TGCAAACTIT ATATGCCTCA GTACAGGGGA ACGCCAGGGC CAAGAAGTGG GAGTGGGTGG 

ATTGAATTAT TGTATTGGGT CTTCTCTCCG GGGAACCAGA ACGTTTGAAA TATACGGAGT CATGTCCCCT TGCGGTCCCG GTICTICACC C1CACCCACC 
10101 GTAGGGGAGC AGGGTGGGGG GAGGGTATAG GGGACTTTCC GGATAGCATT TGAAATGTAA ATGAAGAAAA TATCTAATAA AAATTTGAAA AAAAATGTTA 

CATCCCCTCG TCCCACCCCC CTCCCATATC CCCTGAAAGG CCTATCGTAA ACTTTACATT TACTTCTTTT ATAGATTATT TTTAAACTTT TTTTTACAAT 
10201 CCCCAGTTTG GCCTGGATCT CACTACCTCA ACCAGACTGG CATGTGACTC TGCTGAGATC TGCCTACTIC TGCCTCCTGG GTGCAGAAGA CAATTTTTCG 

GGGGTCAAAC CGGACCTAGA GTGATGGAGT TGGTCTGACC GTACACTGAG ACGACTCTAG ACGGATGAAG ACGGAGGACC CACGTCTTCT GTTAAAAACC 
10301 AAGTTAGTTC TCTTCTTCCA TCTTGTGGAT TCCAGGGATT GAACTCGGGT CATCAGGCTT GGCTGCAAGT GACTTACTTA GGTGTCTCCC AGACCCTCTC 

TTCAATCAAG AGAAGAAGGT AGAACACCTA AGGTCCCTAA CTTGAGCCCA GTAGTCCGAA CCGACGTTCA CTGAATGAAT CCACAGAGGG TCTGGGAGAG 

10401 GGTTTGATTA GTTAGATGCT GCACTTCATG CCTGACTITC GCACTATGTA GATAGAGCAA TGTCTATAAC ATCTCCTACA ATGATATGTA TATCAAGAGC 
CCAAACTAAT CAATCTACGA CGTGAAGTAC GGACTGAAAG CGTGATACAT CTATCTCGTT ACAGATATTG TAGAGGATGT TACTATACAT ATAGTTCTCG 

10501 CAAGTGATCA GATGGCTCAG TGGGTAAGAG CACAGACTGC TCTTCCAAAG GTCCCGAGTT CAAATCCCAG CAATCACATA GTGGCTTCCA TTCCCTCTTA 
GTTCACTACT CTACCGAGTC ACCCATTCTC GTCTCTGACG AGAAGGTTTC CAGGGCTCAA GTTTAGGGTC GTTAGTGTAT CACCGAAGGT AAGGGAGAAT 

10601 TGGAATGTCT GAAGACTGCT ACAGTGTACT TACATATAAT AAATAAATAA ATCTTAAAAA AAAAAAACCC AGCCGGGCX3T GGTGGCGCAC GCCTTTAATC 
ACCTTACAGA CTTCTGACGA TGTCACATGA ATGTATATTA TTTATTTATT TAGAAITITT TTTTTTTGGG TCGGCCCGCA CCACCGCGIG CGGAAATTAG 

10701 CCAGCACTTG GGAGGCAGAG GCAGGCGGAT TCCTGAGTTC GACGCCAGCC TGGTCTACAG AGTGAGTTCC ACGACAGCCA GAACTACACA GAGAAACCCT 
GGTCGTGAAC CCTCCGTCTC CGTCCGCCTA AGGACTCAAG CTGCGGTCGG ACCAGATGTC TCACTCAAGG TGCTGTCGGT CTTGATGTGT CTCTTTCGGA 
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10801 GTCT CGAAAA AAAAAAGAGA GAGAGGGAAG TGAGAGCGCA ATAATCITAA CATTTCTGTG GTTGTCTTPG CTGTAGTCTA TTCTGATAAG CAATGCTQGC 

CAGAGCTTTT TTTTTTCTCT CTCTCCCTTC ACTCTCGCGT TATTAGAATT GTAAAGACAC CAACAGAAAC GACATCAGAT AAGACTATTC GTTACGACCG 
10901 TTGCTCCCAA GGTAGGAAGT AACATTTCTT TATAAAAGGT ATTTGCTCTG CTTTATTTIT CTGTTTTATT TATQGTGCTG AGGATCGAAC CCAGGACCCT 

AACGAGGGTT CCATCCTTCA TTGTAAAGAA ATATTTTCCA TAAACGAGAC GAAATAAAAA GACAAAATAA ATACCACGAC TCCTACCTTG GGTCCTGGGA 
11001 TGGCAAGCAA GGCTAGCTGT TTACCACTCA GCCATACTCC AGCCTIGCAC TGGGGGATTC TAGGCAAGGG TTCTACCACT GAGCCACACT CXXCACCCCC 

ACCGTTCGTT CCGATCGACA AATGGTGACT CGGTATGAGG TCGGAACGTG ACCCCCTAAG ATCCGTTCCC AAGATGGTGA CTCGGTGTGA GGGGTGGGGG 
11101 ATCCCTCTCT GGAAGATTCT AGGCAGTTCC ATACCTAGCC TTTGATCTTT TAAGACGGTC TTACTAGAGC TCAGTT 

TAGGGAGAGA CCTTCTAAGA TOCGTCAAGG TATGGATCGG AAACTAGAAA ATTCTGCCAG AATGATCTCG AGTCAA 
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10 20 30 40 50 60 70 80 90 100 

AAGCTTGCAGGGAGGTAGGAGGCAGCCTGTGGCGTTGATTCAATGCACCl^^ 
TTCGAACGTCCCTCCATCCTCCXSTCGGACACCGCAACTAAG^ 

110 120 130 140 150 160 170 180 190 200 

AGGTCTTGGGTGCTTAACATCTATTTTTACAAATCTTATTTAGCAACTTAGAACTGTGAAATATTGGAAAGCTACTTAAA 

TCCAGAACCCACGAATTGTAGATAAAAATGTTTAGAATAAATCGTTGAATCTTGACACTTTATAACCTTTCGATGAATTTGGAAGATTTGAGGGAGGAGG 

210 220 230 240 250 260 270 280 290 300 

ACACTATGAGAATGTTACATTTTCTATTCAGTTATTTTTGAGCAGTAAACAGATGAATCAAGGAATATGCCCATCACATCAAGAGTGCTCCTAAATGGAC 
TGTGATACTCTTACAATGTAAAAGATAAGTCAATAAAAACTCGTC 

310 320 330 340 350 360 370 380 390 400 

TTGCTTGTTATTCATTTACAGTGTGGCCCCTTGACTTTCATC
AACGAACAATAAGTAAATGTCACACCGGGGAACTGAAAGTA 

410 420 430 440 450 460 470 480 490 500 

TAAGAATACTTATCCCTACACAGGCCCTGGAGCCAGTTC
ATTCTTATGAATAGGGATGTGTCCGGGACCTCGGTCAAGGGTCGTG^ 

510 520 530 540 550 560 570 580 590 600 

TCTTCTGTCTGAAAACACCAGGCACGCGTGCGGCTAC^

AGAAGACAGACTTTTGTGGTCCGTGCGCACGCCGATGTATGTTTGTACTTTCGTTTTATGTGTGTAATGTATTTATTTAGAATTTTTTACTAAGCCCCAC 

610 620 630 640 650 660 670 680 690 700 

GGGGAAGGAAAAAAAAGGATGTTAGAAAATCGATGTAACTGTTTTTTC 
CCCCTTCCTTTTTTTTCCTACAATCTTTTAGCT 

710 720 730 740 750 760 770 780 790 800 

AATTGTTTTCATTGCCCCCAAGTCTGCTAATAGAGCT

TTAACAAAAGTAACGGGGGTTCAGACGATTATCTCGAACGATGGAAGTACCGACAGCATTCCTACTCCGTTTCTACCTGAAGTCGAAAGTCTGACACAGA 

810 820 830 840 850 860 870 880 890 900 

GCTCAAATGTTGGCTACTCCTGTTTTCTGACCCCCT^
CGAGTTTACAACCGATGAGGACAAAAGACTGGGGGAAGAGACCACGTT^ 

910 920 930 940 950 960 970 980 990 1000 

TATTTTATTTTATGTAATTGTATGTATATGCATGTC 

ATAAAATAAAATACATTAACATACATATACGTACAGTTATTCGTATACACACACACAAAGGTACCTTTGGTTCCGTTGTCTAAGA

1010 1020 1030 1040 1050 1060 1070 1080 1090 1100 

TGGGCTGTGAGACGCCCACTGTGGGTGCTCGGAACCAAACTC 

ACCCGACACTCTGCGGGTGACACCCACGAGCCTTGGTTTGAGCCCAGGACACCTTTCTGTCGCTCGTGGGTATTACGTCTCCATAGAGAGTCTGAGATGA 

1110 1120 1130 1140 1150 1160 1170 1180 1190 1200 

TTAAAATTTCAATTTATCTTTTTTTTTTTTAAA 
AATTTTAAAGTTAAATAGAAAAAAAAAAAATTTCAAGGTTC&TTG^ 

1210 1220 1230 1240 1250 1260 1270 1280 1290 1300 

GTAGCACAACTTGGTCTGCTTCACATAAAGAATGGAAAGTCATT^^ 

CATCGTGTTGAACCAGACGAAGTGTATTTCTTACCTTTCAGTAATTTTGTGAGTAGTGTGACATTTCATCTTAACTTGAGACTGTCTTGTTCGCTTCACT 

1310 1320 1330 1340 1350 1360 1370 1380 1390 1400 

GTCTGACTTCCAGGTAACTGAGCCTTCTTTTCCTC 

CAGACTGAAGGTCCATTGACTCGGAAGAAAAGGAGGATTTCTGTGTTCGGTATGTGTCTCATTTTATTTGAACCCGTACCACTCTTCCTTTGTTGCGTCC 

1410 1420 1430 1440 1450 1460 1470 1480 1490 1500 

AGGGCTAGCCAAGTCTGAGAGTCGTGAGTGTGCTCGGTTTATA 

TCCCGATCGGTTCAGACTCTCAGCACTCACACGAGCCAAATATTTGCCTCGGGTGGAACGGTCGCTCCATCAGTGTACGAGACGATTTGTCTTTGAATTC 

1510 1520 1530 1540 1550 1560 1570 1580 1590 1600 

AAAACACTTACACGAAGCAAACATGGGGAAGTGCCATGCAAGCATGTGAC^ 

TTTTGTGAATGTGCTTCGTTTGTACCCCTTCACGGTACGTTCGTACACTGACTGACCACCGTTACTGGCTTTGGTGTCGTCGGTGATCTTTTCCTTCCCA 

1610 1620 1630 1640 1650 1660 1670 1680 1690 1700 

AGTGCGCCACACTGTAGTTGTGAAAATGAACTTATTCATTTATTTTGAAA 

TCACGCGGTGTGACATCAACACTTTTACTTGAATAAGTAAATAAAACTTTTTGCACATTCTTCGTTTCTACAGAAGAAAGGGTGGATGGAAACGCCGTCC

1710 1720 1730 1740 1750 1760 1770 1780 1790 1800 

CGAGCACTTCCTGGAATTTATAAAGTGCGATCTTTCTGGGGACTTCTCATAACATTTCCTACTGCTCATCTATGTCTGTGTCAAATAGAGAATGCTCTTG 
GCTCGTGAAGGACCTTAAATATTTCACGCTAGAAAGACCCCTGAAGAGTATTGTAAAGGATGACGAGTAGATACAGACACAGTTTATCTCTTACGAGAAC 

1810 1820 1830 1840 1850 1860 1870 1880 1890 1900 

AACAAGTGTGTCTGTGTGTGTGTGTGCGCGCGCACGCGCACTCACTCCT 

TTGTTCACACACACACACACACACACGCGCGCGTGCGCGTGAGTGAGGACGAGACAACTCCAGGTCAAAACTACCAGGGCGGTCTCCATATAAACTCATA 

1910 1920 1930 1940 1950 1960 1970 1980 1990 2000 

CATTTCTCAAGAGCTTCAGCTGGGAGACACTGCCTCTTACTGGCCTGAAGGTCACTAGCTGATTCATCTCCGTTTGGGCTGGCGCGCCTTGGGGATCCTC 
GTAAAGAGTTCTCGAAGTCGACCCTCTGTGACGGAGAATGACCGGACTTCCAGTGATCGACTAAGTAGAGGCAAACCCGACCGCGCGGAACCCCTAGGAG 

2010 2020 2030 2040 2050 2060 2070 2080 2090 2100 

CTATCTCTCCTTCCCCAGTGCTGGGATAACAAGGTT 

GATAGAGAGGAAGGGGTCACGACCCTATTGTTCCAACCGTGGTGTACTCGGAAAATTTTACACTCAAACCTTCGAGTTTGCGTCCAAAAGTACGAACGTG 

2110 2120 2130 2140 2150 2160 2170 2180 2190 2200 

TGAAACTTCACAAGCTGAACCGTCTCCCTCTCCTTCCCTCTCTTTTTTCCTTTTCTTCTTCCTTTTTAAAACACATCTTGTCTTTAAAAAAAAAAAAAGG 
ACTTTGAAGTGTTCGACTTGGCAGAGGGAGAGGAAGGGAGAGAAAAAAGG 



2210 2220 2230 2240 2250 2260 2270 2280 2290 2300 

CCCAAAACAAGTGTAAAGTATTTCCCTATGTGTGTGGAGGGA 
GGGTTTTGTTCACATTTCATAAAGGGATACACACACCTCCCTC^ 

2310 2320 2330 2340 2350 2360 2370 2380 2390 2400 

CAAAGACGCATCGTTTCCTCTAAGAATTCTAAATGGGGCGATTACCACGGGCCTGCAGGTTCTGGTTTGTATTAGAGGAGACACTGTCTTCTTAAGTAAA 
GTTTCTGCGTAGCAAAGGAGATTCTTAAGATTTACCCCGCTA^ 

2410 2420 2430 2440 2450 2460 2470 2480 2490 2500 

ACATAGAAGGGGAAGTGTCCAGAATTGTAAATAAGGCTTCGAGAGAAGCCT 
TGTATCTTCCCCTTCACAGGTCTTAACATTTATTCCGAAGCT^ 

2510 2520 2530 2540 2550 2560 2570 2580 2590 2600 

ATCGTCCCTCCCTCTACCCAGATCTGACAGCCCTCCTT 
TAGCAGGGAGGGAGATGGGTCTAGACTGTCGGGAGGAACCGAGAAAACGACTC 

2610 2620 2630 2640 2650 2660 2670 2680 2690 2700 

ATTCTACCCTGTTCACAAGTAAATACACCTCTTAGC^ 
TAAGATGGGACAAGTGTTCATTTATGTGGAGAATCGATTCTCCG^ 

2710 2720 2730 2740 2750 2760 2770 2780 2790 2800 

CGATAGGTACACCAAGCAGCCTTCATACGGAGTTTTCATTCG 

GCTATCCATGTGGTTCGTCGGAAGTATGCCTCAAAAGTAAGCACTCCTCGACTTATATGTTGTTTCGATTTACACTCGTCTGGTCCGTACGGAGACGATT 

2810 2820 2830 2840 2850 2860 2870 2880 2890 2900 

ATGAGGATGCCCACACCAAACATGCCCAAGATCTTCAAGTATAATTTTATTATATAGATTCGCTATGTGTTGACATGTTTTTATAGTGAACCTGGATTTO 
TACTCCTACGGGTGTGGTTTGTACGGGTTCTAGAAGTTCATAT^ 

2910 2920 2930 2940 2950 2960 2970 2980 2990 3000 

ACAAACCCTCCTGGTTTGCCACCTGCTTCTGGC^CCATAC 
TGTTTGGGAGGACCAAACGGTGGACGAAGACCGTGGTATGAACTC

3010 3020 3030 3040 3050 3060 3070 3080 3090 3100

AAAGCACCATGTTACATCATTAATCATGCATATCAGTGTAGTTT^^

TTTCGTGGTACAATGTAGTAATTAGTACGTATAGTCACATCAAATCTAGGCTACATCTCTGTTATTAGAATAGAGAAACAGACCGACTTTCTGACAGGAA


3110 3120 3130 3140 3150 3160 3170 3180 3190 3200

TAAACTATCATTCTAAATGCATTTGGTTTTTGCCAG 
ATTTGATAGTAAGATTTACGTAAACCAAAAACGGTCCTCATTTTGTACAGTGTTCTATAAACAACAGTAAAGGGTCCGCACCTTCC

3210 3220 3230 3240 3250 3260 3270 3280 3290 3300

AAAACGAGGGGTGAAGGCTGCTGTTCCTCTCTAGTCGCTACTTGAAGTCTACAra^

TTTTGCTCCCCACTTCCGACGACAAGGAGAGATCAGCGATGAACTTCAGATGTATCGACCCCCCCCCCCCCCCTGACAAGTGTACCCTGGCCAAAGGAGA 

3310 3320 3330 3340 3350 3360 3370 3380 3390 3400

TTGTTCCTACACTGGCGCCTCTGGCAAGAAACTCTCCCT 
AACAAGGATGTGACCGCGGAGACCGTTCTTTGAGAGGGAAGAGAAGGGGGGTTCGTATAGAACCGACTTTCCAGTCGAGACTTTTCCCCGGACCGGTTTC

3410 3420 3430 3440 3450 3460 3470 3480 3490 3500 

TTACTGTAGGGGACCGTGGTCATGGAACTGGGTAGACAAAAGCACTCTAGCAGCCACTGGAGAAGGACCGGGGGCTCTTCTCTGTGC^
AATGACATCCCCTGGCACCAGTACCTTGACCCATCTGTTTTCGTGAGATCGTCGGTGACCTCTTCCTGGCCCCCGAGAAGAGACACGTAAACGGGACCTC

3510 3520 3530 3540 3550 3560 3570 3580 3590 3600 

CCCTGACCACCGCCAGCTCCCTGCATCTCCTTGCTA 

GGGACTGGTGGCGGTCGAGGGACGTAGAGGAACGATACCCAAAAGACCTGGCTCGGTCCGTCCTCAAGTGTTGGCTTTACAGAAGATCCCGATTAGTCCA 

3610 3620 3630 3640 3650 3660 3670 3680 3690 3700 

AACTTCGGACGATTTAAAGTTGCCAGATGGACGAGAAAACA 

TTGAAGCCTGCTAAATTTCAACGGTCTACCTGCTCTTTTGTCATCTCCGCAACCGTTGGACCTATTCGCGGATAGAAGATTAATTTTGTAAGTCTGCCCC 

3710 3720 3730 3740 3750 3760 3770 3780 3790 3800 

CGGGGGATGCGGTGGCCAAAGCACCATAAAACAAAACTTCCA^
GCCCCCTACGCCACCGGTTTCGTGGTATTTTGTTTTGAAGGTTC^

3810 3820 3830 3840 3850 3860 3870 3880 3890 3900 

TGTCTTCATGCTCCCAACTGCGGGCGGATTTTTGGTCCCTTGGGACT 

ACAGAAGTACGAGGGTTGACGCCCGCCTAAAAACCAGGGAACCCTGAAAGTCACGTCGCCGCTTCTCTCAAGACGTGAACGTCCGAGGATTACTCCCGCG 

3910 3920 3930 3940 3950 3960 3970 3980 3990 4000 

AGTGGGCCTCGTGTTTCTGGTGATGCTTCCCAGGTTGCT 
TCACCCGGAGCACAAAGACCACTACGAAGGGTCCAACGACCCCCG^^ 

4010 4020 4030 4040 4050 4060 4070 4080 4090 4100 

CTGCGTCCAGATTTGCTCTCAGATGCGACTTGCCGCCCGGCACAGTTCCGGGGTAGTGGGGGAGTGGGCGTGGGAAACCGGGAAACCCAAACCTGGTATC 
GACGCAGGTCTAAACGAGAGTCTACGCTGAACGGCGGGCCGTGTCAAGGCCCCATCACCCCCTCACCCGCACCCTTTGGCCCTTTGGGTTTGGACCATAG 

4110 4120 4130 4140 4150 4160 4170 4180 4190 4200 

CAGTGGGGGGCGTGGCCGGACGCAGGGAGTCCCCACCCCTCCCGGTAATGACCCCGCCCCCATTCGCTAGTGTGTAGCCGGCGCTCTCTTTCTGCCCTGA 
GTCACCCCCCGCACCGGCCTGCGTCCCTCAGGGGTGG^

4210 4220 4230 4240 4250 4260 4270 4280 4290 4300 

GTCCTCAGGACCCCAAGAGAGTAAGCTGTGTTTCCTTAGATCGCGCGGACCGCTACCCGGCAGGACTGAAAGCCCAGACTGTGTCCCGCAGCCGGGATAA 

CAGGAGTCCTGGGGTTCTCTCATTCGACACAAAGGAATCTAGCG^

4310 4320 4330 4340 4350 4360 4370 4380 4390 4400 

CCTGGCTGACCCGATTCCGCGGACACCGCTGCAGCCGCGGCTGGAGCCAGGGCGCCGGTGCCCCGCGCTCTCCCCGGTCTTGCGCTGCGGGGGCGCATAC 
GGACCGACTGGGCTAAGGCGCCTGTGGCGACGTCGGCGCCGACCTCGGTCCCGCGGCCACGGGGCGCGAGAGGGGCCAGAACGCGACGCCCCCGCGTATG 



4410 4420 4430 4440 4450 4460 4470 4480 

CGCCTCTGTGACTTCTTTGCGGGCCAGGGACGGAGAAGGAGTCTGTGCCTGAGAACTGGGCTCTGTGCCCAGCGCGAGGTGCAGATG
GCGGAGACACTGAAGAAACGCCCGGTCCCTGCCTCTTCCTCAGACACGGACTCTTGACCCGAGACACGGGTCGCGCTCCACGTCTAC 
10 20 30 40 50 60 70 80 90 100

AAATGTGCTGTCTTTAGAAGCCACTGCCTCAGCTTCTGCAGCTCAGATACCAAAGGAAGTCTGGTACACAGCATGATAAAAGACAATGGGACGGGGTCAC 
TTTACACGACAGAAATCTTCGGTGACGGAGTCGAAGACGTCGAGTCTATGGTTTCCTTCAGACCATGTGTCGTACTATTTTCTGTTACCCTGCCCCAGTG 

110 120 130 140 150 160 170 180 190 200 

AGTGGCTCCCGTCCCTTTCAGGGGTATGGAGACGAGCTGTAGAGAGATGTCTCCAGGGAGTTTTCATTAATCAGCAATTTAGTCAGATCTGTGCATCCTA 
TCACCGAGGGCAGGGAAAGTCCCCATACCTCTGCTCGACATCTCTCTACAGAGGTCCCTCAAAAGTAATTAGTCGTTAAATCAGTCTAGACACGTAGGAT 

210 220 230 240 250 260 270 280 290 300 

TGCTTTACAAGAAATGTCAGTGGGCCTGAGATCATCAGATGGAGGTTCATCGGGTTTCAATGTCCCGTATCCTTTTGTAAGACCTTGAAGTTGGCAACGC 
ACGAAATGTTCTTTACAGTCACCCGGACTCTAGTAGTCTACCTCCAAGTAGCCCAAAGTTACAGGGCATAGGAAAACATTCTGGAACTTCAACCGTTGCG 

310 320 330 340 350 360 370 380 390 400 

AGGAAAACAGGAACTCCACCCTGGTGCCGTGAATTGCAGAGCTGTTGTGTTGGTTTGTGACCATCTGCCCATTCTTCCTGTTATGACAGAGCTTGTGAAC 
TCCTTTTGTCCTTGAGGTGGGACCACGGCACTTAACGTCTCGACAACACAACCAAACACTGGTAGACGGGTAAGAAGGACAATACTGTCTCGAACACTTG 

410 420 430 440 450 460 470 480 490 500 

TTTAACTGGGACTGGGGCAAAGTCAATCCCACCTTTATACAATGAATTGCTGAAGAGGCCTTTTAAAACTTGGAGTGTGCATTGTTTATGGAAGGGCTTT 
AAATTGACCCTGACCCCGTTTCAGTTAGGGTGGAAATATGTTACTTAACGACTTCTCCGGAAAATTTTGAACCTCACACGTAACAAATACCTTCCCGAAA 
GGTACCAAAGCATAGAACTACAGATCCGCTCTCTGCCTGTACCACCCTCTGGCATTTAATCACACAATGCTTGGTTTTGTTCTTCAACTTTTCCTGTTAT
CCATGGTTTCGTATCTTGATGTCTAGGCGAGAGACGGACAT^ 

110 120 130 140 150 160 170 180 190 200 

GATGCAGTCCCTGGCTTGTGTAACTATGAGCTTCAAAAGCAAAGAACGCATCATCTATTTTTGTGTCTCTTCTTCCAAGGACTTAGTGTATCACTTACTG 
CTACGTCAGGGACCGAACACATTGATACTCGAAGTTTTCGTTTCTTGCGTAGTAGATAAAAACACAGAGAAGAAGGTTCCTGAATCACATAGTGAATGAC 

210 220 230 240 250 260 270 280 290 300 

GCTAAATGCTTGAGACAAAAACAGGGATTAATGAAGAAGAAAGAGAAAGAAAAGGAAGGGAAAGTGCCCACAATTACTGACAGGGTTTCAGTAAAGCAGT
CGATTTACGAACTCTGTTTTTGTCCCTAATTACTTCTTCTTTCTCTTTCTTTTCCTTCCCTTTCACGGGTGTTAATGACTGTCCCAAAGTCATTTCGTCA 

310 320 330 340 350 360 370 380 390 400 

CTAGAGGGTCAGGTATTTTCCATAGCCATGCCCCAGAGTGGGTGTTGCCACTTTAGCTGCCCTGGTCTGGCTGAAGGCCAGGACTTGATTGTTGATGGCC 
GATCTCCCAGTCCATAAAAGGTATCGGTACGGGGTCTCACCCACAACGGTGAAATCGACGGGACCAGACCGACTTCCGGTCCTGAACTAACAACTACCGG 

410 420 430 440 450 460 470 480 490 500 

CTTCCTTTGCTGCTAGTCACTGTTAAGTACTGCAGATTTACAGAAAGCTTCATGGAGGTCTGTAAGAAGCCAGAGGTGATAACACCAAGATTTAGAGCCA
GAAGGAAACGACGATCAGTGACAATTCATGACGTCTAAATGTCTTTCGAAGTACCTCCAGACATTCTTCGGTCTCCACTATTGTGGTTCTAAATCTCGGT 

510 520 530 540 550 560 570 580 590 600 

CTGACCAGCAGAATGCAGAATGTCCAGGCTATGATCCAGGTTGTAGATCCTGATCTGACTACTCAAGACTGGTTGAAGGCAAGGTTCACTTGGATTCACT 
GACTGGTCGTCTTACGTCTTACAGGTCCGATACTAGGTCCAACATCTAGGACTAGACTGATGAGTTCTGACCAACTTCCGTTCCAAGTGAACCTAAGTGA 

610 620 630 640 650 660 670 680 690 700 

CTATTTGCCAGCAGATGTTTTAAATCCATCATATATATATATATATCTCCATTACTTTAGGACAGTGGTTCTCAGCCTTCCTAATGCTGTAGCCCTTTAA 
GATAAACGGTCGTCTACAAAATTTAGGTAGTATATATATATATATAGAGGTAATGAAATCCTGTCACCAAGAGTCGGAAGGATTACGACATCGGGAAATT 

710 720 730 740 750 760 770 780 790 800 

TAGAGTTCCTCATATTGTGATTGTAAAAATTATTTTGTTGCTACTTCATGACTAATTTTGCTACTGTGAAAGGGTCATTTTACCCCAGGCTGTTGAGACC
ATCTCAAGGAGTATAACACTAACATTTTTAATAAAACAACGATGAAGTACTGATTAAAACGATGACACTTTCCCAGTAAAATGGGGTCCGACAACTCTGG 

810 820 830 840 850 860 870 880 890 900 

CACATGTTGGGAACCACTACTTTAGAAGGCATTGGGGTTGGAGAAGAAC 

GTGTACAACCCTTGGTGATGAAATCTTCCGTAACCCCAACCTCTTCTTGTACTTCTTATCTCATTGTCACCAGTCAAAACCAAGTAATATAGTGTCTTTG 

910 920 930 940 950 960 970 980 990 1000 

ATTCACTTTAAGGTTTCAGCATGTTTGTTGTGTATATGTGATTGTGTAAAGACTTCACCAGGTCTTTCTTTAATCACCATACCTAACATCTTCACCACTC 
TAAGTGAAATTCCAAAGTCGTACAAACAACACATATACACTAACACATTTCTGAAGTGGTCCAGAAAGAAATTAGTGGTATGGATTGTAGAAGTGGTGAG 

1010 1020 1030 1040 1050 1060 1070 1080 1090 1100 

CATATCCATCAGCTTCACCTTGTACTCTAGCATTTGGGCATTCATCCTGTACCAGGGCAGGCATCCATTCTTTTGCAACTCACATTGTTTCCTAGTTTTG 
GTATAGGTAGTCGAAGTGGAACATGAGATCGTAAACCCGTAAGTAGGACATGGTCCCGTCCGTAGGTAAGAAAACGTTGAGTGTAACAAAGGATCAAAAC 

1110 1120 1130 1140 1150 1160 1170 1180 1190 1200 

ATTATTACCAACAATGCTTCTAGACCATGAATTTTGGTC 

TAATAATGGTTGTTACGAAGATCTGGTACTTAAAACCAGAAACTGAAAACGAACCATTTGTAGTATTTTGTTAGGTCACCACCACCACCACGGCGACGAC 

1210 1220 1230 1240 1250 1260 1270 1280 1290 1300 

CTGGTGGTGGTGGTAAAGCAGGAAGCCATAAAGTGCCTTTATTCAATCTGTATTTGATACAAATTGTTATTTCTTCCCATGTAAAAGATATGGCATCTGA 
GACCACCACCACCATTTCGTCCTTCGGTATTTCACGGAAATAAGTTAGACATAAACTATGTTTAACAATAAAGAAGGGTACATTTTCTATACCGTAGACT 

1310 1320 1330 1340 1350 1360 1370 1380 1390 1400 

AGTGTAGAGGTCTGAATTCAAACCTCACATCACCAGATAGTATATTACAGACTCAACAAATAATACACGGCTTTGCCTGACTTCAAAGCCCTGTTCTTGA 
TCACATCTCCAGACTTAAGTTTGGAGTGTAGTGGTCTATCATATAATGTCTGAGTTGTTTATTATGTGCCGAAACGGACTGAAGTTTCGGGACAAGAACT 

1410 1420 1430 1440 1450 1460 1470 1480 1490 1500 

CGTAAGTATATGAGTAACAATGGTAGCACCTTAGTTTTTATCAGTTCACTAAATATTTATATAAGACCTACTATGAAGGGAGATAGAAGGGTATGAGGTG 
GCATTCATATACTCATTGTTACCATCGTGGAATCAAAAATAGTCAAGTGATTTATAAATATATTCTGGATGATACTTCCCTCTATCTTCCCATACTCCAC

1510 1520 1530 1540 1550 1560 1570 1580 1590 1600 

GGGTCATGGGAATAGGAAAACGGTGGAAGGGAGAAGGAGAATTAACAAAAGCTAATTATGTTTGAAAATGCCACAATGAAACCTAATTTACAAAAGAACC
CCCAGTACCCTTATCCTTTTGCCACCTTCCCTCTTCCTCTTAATTGTTTTCGATTAATACAAACTTTTACGGTGTTACTTTGGATTAAATGTTTTCTTGG 

1610 1620 1630 1640 1650 1660 1670 1680 1690 1700 

ACTATATGACCTTCACAGTGTGTGCTAAGTCTTGGAGATTTAGTGGTGAAGAAGTCAGGTGTGTTTCCAATCTCATGGAGGATGTAATCAGTTAGAGAGC 
TGATATACTGGAAGTGTCACACACGATTCAGAACCTCTAAATCACCACTTCTTCAGTCCACACAAAGGTTAGAGTACCTCCTACATTAGTCAATCTCTCG 

1710 1720 1730 1740 1750 1760 1770 1780 1790 1800 

ACAGGAGCACATAAAAAGATAGGCAAAAATGTATGATTAGTACCAT^ 

TGTCCTCGTGTATTTTTCTATCCGTTTTTACATACTAATCATGGTACATTCTATACTTCCCCTTGTGTCCTTTGATCACCCCTCTGGATTAAATCAAACT 

1810 1820 1830 1840 1850 1860 1870 1880 1890 1900 

GTGGTCTTCAAAGACCCTTTAGAAGCTGAGAACTAAAGACAGCAAGCAAGGTGAGGGCAGCATCTCCACCTTTCCAGTGGAATGAGCAACTTAGGGTATA
CACCAGAAGTTTCTGGGAAATCTTCGACTCTTGATTTCTGTCGTTCGTTCCACTCCCGTCGTAGAGGTGGAAAGGTCACCTTACTCGTTGAATCCCATAT

1910 1920 1930 1940 1950 1960 1970 1980 1990 2000 

CAGCTGATTCCCACATTGTCAACAAGGCTCTTCAGAGACTAGAGATGCACTAATGATGACCATACCCAGCTTTTAAGGAAGGTTTCTGAGCATGTCCAAG 
GTCGACTAAGGGTGTAACAGTTGTTCCGAGAAGTCTCTGATCTCTACGTGATTACTACTGGTATGGGTCGAAAATTCCTTCCAAAGACTCGTACAGGTTC 

2010 2020 2030 2040 2050 2060 2070 2080 2090 2100 

CACCCTACACTAGGCATTGGAAATCAACATGTCCAGAGATGGAAGTGACAGTCAGTAAGCCAACCCTTTTCAAAACTTCCAAAGCTATTACTCGTCAACT

GTGGGATGTGATCCGTAACCTTTAGTTGTACAGGTCTCTACCTTCACTGTCAGTCATTCGGTTGGGAAAAGTTTTGAAGGTTTCGATAATGAGCAGTTGA

2110 2120 2130 2140 2150 2160 2170 2180 2190 2200 

CTCCAGACATATGGGCCCCGAGTGTGTTGGGAAGCTCTCATTATTGTTCTTTGATTGGTTCTCTACATTCCGAGATCCAAGGAGCAGTTATCTCAGGTAG 
GAGGTCTGTATACCCGGGGCTCACACAACCCTTCGAGAGTAATAACAAGAAACTAACCAAGAGATGTAAGGCTCTAGGTTCCTCGTCAATAGAGTCCATC 



2210 2220 2230 2240 2250 2260 2270 2280 2290 2300 

AGGATCGTGGAATGTCTGCCCATGATTAACTTCAATTTATACCTGTAAGTTATACCACATCCTAAACACGCTGATGTCCCAGAGAACATTTTGACCAGCT 
TCCTAGCACCTTACAGACGGGTACTAATTGAAGTTAAATATGGACATTCAATATGGTGTAGGATTTGTGCGACTACAGGGTCTCTTGTAAAACTGGTCGA 

2310 2320 2330 2340 2350 2360 2370 2380 2390 2400 

GCTAACAAAACCCAGGAGCATTTAGAAAAAAACTGAGTCACCCACCGTTC 

CGATTGTTTTGGGTCCTCGTAAATCTTTTTTTGACTCAGTGGGTGGCAAGACCTATTACTACCTCTCTTTGTTTACCCTAATAAGAATGTCTCATACTTT 



2410 2420 2430 2440 2450 2460 2470 2480 2490 2500 

GTTACATAATTTTCCTGGATAATGGAGAATTAATTAAA 
CAATGTATTAAAAGGACCTATTACCTCTTAATTAATTTGTAGTC 

2510 2520 2530 2540 2550 2560 2570 2580 2590 2600 

GGAGGAGGAAAGAATTTGACTACTATTTGGGGGTTAACAATACATCTTACTAGCATGGCAAAGGAAACTGGGCTGCTTTTCAGAGTAAGCCACCCCAGTA 
CCTCCTCCTTTCTTAAACTGATGATAAACCCCCAATTGTTATGTAGAATGATCGTACCGTTTCCTTTGACCCGACGAAAAGTCTCATTCGGTGGGGTCAT

2610 2620 2630 2640 2650 2660 2670 2680 2690 2700 

GATGCTGCAAGGCTGTGCTTTCATCCCAGGAGAAAGTCAACAGGGCCAGGCATGCCAGAACATGCCCATAATGTAACCACTTAGGCTGAGGCAGAAAGAT 
CTACGACGTTCCGACACGAAAGTAGGGTCCTCTTTCAGTTGTCCCGGTCCGTACGGTCTTGTACGGGTATTACATTGGTGAATCCGACTCCGTCTTTCTA 

2710 2720 2730 2740 2750 2760 2770 2780 2790 2800 

CAAAAATCCCAGGCCAGCTTAGTTTGTGTAACAAGACCTTTC 

GTTTTTAGGGTCCGGTCGAATCAAACACATTGTTCTGGAAACGAGTTTGTTTCTAAATGTTTTGTTTGTTCGTTTGTTTGTTTATATTTTTTCCTCTTCT 

2810 2820 2830 2840 2850 2860 2870 2880 2890 2900 

AAATAACTGCCAGGGGAGGCTGTGAGCAATGAAGACTTGATGAGTGAC^ 

TTTATTGACGGTCCCCTCCGACACTCGTTACTTCTGAACTACTCACTGGTAGAGCGTGTCACCTGCGAACACAGATCTTCCATTCCCGAACCGTTACAAA 

2910 2920 2930 2940 2950 2960 2970 2980 2990 3000 

CCCAGGTTTTCCATTCCTGGTTTATATGGCTTGAGGCCAGTGGACTTCACAATGTCTCAGCTTCCAGGTCTTTATACAGAGCATATTAGCCACATGTGGT 
GGGTCCAAAAGGTAAGGACCAAATATACCGAACTCCGGTCACCTGAAGTGTTACAGAGTCGAAGGTCCAGAAATATGTCTCGTATAATCGGTGTACACCA 

3010 3020 3030 3040 3050 3060 3070 3080 3090 3100 

AGCTTGTGCCTGTAATGCTGGCACTTGAGAGACCAAGACAGGAGGATTGCCACAAGTCTCCATCCAGCCTAGGTGCTGTGTCACTCTGTCTCACCCCTGA 
TCGAACACGGACATTACGACCGTGAACTCTCTGGTTCTGTCCTCCTAACGGTGTTCAGAGGTAGGTCGGATCCACGACACAGTGAGACAGAGTGGGGACT 

3110 3120 3130 3140 3150 3160 3170 3180 3190 3200 

CCCAGTCCCACCCAACATC^^CAGGCTATCACTGTGACACTGGTACTGAGTCAGAATCACCCAGATTAAAGATTCTGGGAGATCAGTCCTGGGGATGCG 
GGGTCAGGGTGGGTTGTAGTTTGTCCGATAGTGACACTGTGACCATGACTCAGTCTTAGTGGGTCTAATTTCTAAGACCCTCTAGTCAGGACCCCTACGC 

3210 3220 3230 3240 3250 3260 3270 3280 3290 3300 

GGAAGTGAGACCAGTTATTTAATAATTCTTATACTCATGAGATGATGGATCCAGATGAGAAATTGTAAAAATTTTAGGTTTTATAATTGAAGAAATAGGT 
CCTTCACTCTGGTCAATAAATTATTAAGAATATGAGTACTCTACTACCTAGGTCTACTCTTTAACATTTTTAAAATCCAAAATATTAACTTCTTTATCCA 

3310 3320 3330 3340 3350 3360 3370 3380 3390 3400 

GGTTTCTTCAGGTTACATCTCTCCACTGTTGGTCATTTCAGCTAAGGTCACTCCCCATTGATTCCTGTGAGGCTCTCACATCCCAGGTCTCTGGGACTTT 
CCAAAGAAGTCCAATGTAGAGAGGTGACAACCAGTAAAGTCGATTCCAGTGAGGGGTAACTAAGGACACTCCGAGAGTGTAGGGTCCAGAGACCCTGAAA 

3410 3420 3430 3440 3450 3460 3470 3480 3490 3500 

CTAGAGGTTCCCGCTGCTTCCCAGCCCTGAAAATGCGTATTTCTATTCATTCTCCTGGCATTCTGGGCTTCTCTCCTGTCCCCCGCCCCACCCAACACCT 
GATCTCCAAGGGCGACGAAGGGTCGGGACTTTTACGCATAAAGATAAGTAAGAGGACCGTAAGACCCGAAGAGAGGACAGGGGGCGGGGTGGGTTGTGGA 

3510 3520 3530 3540 3550 3560 3570 3580 3590 3600 

GATCCTGCCCCCTTTCTCTCCCCCTTCTCTCTCTAAACCAGGTCCCTCCCTCCCTCTGCTTCCCATGATTATTTTGTTCCCTCCTCTAAATGAGTCTGAA 
CTAGGACGGGGGAAAGAGAGGGGGAAGAGAGAGATTTGGTCCAGGGAGGGAGGGAGACGAAGGGTACTAATAAAACAAGGGAGGAGATTTACTCAGACTT 

3610 3620 3630 3640 3650 3660 3670 3680 3690 3700 

GCATCCTCACTTGGACNTTCCTTCTTGTTAAACTTCATATGGTCTGTGAGTTGTATCATGGGTATTCTGTACTTTTTTGGCTAATGTTTCACTTATCAGT
CGTAGGAGTGAACCTGNAAGGAAGAACAATTTGAAGTATACCAGACACTCAACATAGTACCCATAAGACATGAAAAAACCGATTACAAAGTGAATAGTCA 

3710 3720 3730 3740 3750 3760 3770 3780 3790 3800 

GAGTGCAAACCAGGCATATCCTTTTGAGTTTGGGTTACCTCACTCAGGATGATATTTTCTAGTTCTATCCATTCGCCTGCAAAATTCATGATGTCCTAAT 
CTCACGTTTGGTCCGTATAGGAAAACTCAAACCCAATGGAGTGAGTCCTACTATAAAAGATCAAGATAGGTAAGCGGACGTTTTAAGTACTACAGGATTA 

3810 3820 3830 3840 3850 3860 3870 3880 3890 3900 

TTTTAGTAGCTGAATAGTATTCCATTGTGTAAATGAACCATATTTTCTGCATCTGTTCTTCAGCTGAGGGAAATCTGGGTTGTTTCCAGCTTCTAGGTAT 
AAAATCATCGACTTATCATAAGGTAACACATTTACTTGGTATAAAAGACGTAGACAAGAAGTCGACTCCCTTTAGACCCAACAAAGGTCGAAGATCCATA 

3910 3920 3930 3940 3950 3960 3970 3980 3990 4000 

TATAAATAAGGTTGCTATGAACATAGTGGAACACATATCCTTGAGGTATGGTAGAGCATCTTTTGGGTATATATCCAGGAGTGGATAGTTGGGTTTTCAG 
ATATTTATTCCAACGATACTTGTATCACCTTGTGTATAGGAACTCCATACCATCTCGTAGAAAACCCATATATAGGTCCTCACCTATCAACCCAAAAGTC

4010 4020 4030 4040 4050 4060 4070 4080 4090 4100 

GTAGAACTATTTCCAATTTTCTAAGGAACCACCAGATTGATTTTTAGATAGACAGGGCCCCTAGTGGAGAGATGGGGCCAAACACCTACCTTCAAAAATT 
CATCTTGATAAAGGTTAAAAGATTCCTTGGTGGTCTAACTAAAAATCTATCTGTCCCGGGGATCACCTCTCTACCCCGGTTTGTGGATGGAAGTTTTTAA 

4110 4120 4130 4140 4150 4160 4170 4180 4190 4200 

TGGTCCAGAATTGTTCCTCTCTAAAAGAAATGCAGGGACAAAAATGAAACAGAGACTGACCAACCCAACTTAGGATCCATCCTATGGGCAAGCACCAAAC 
ACCAGGTCTTAACAAGGAGAGATTTTCTTTACGTCCCTGTTTTTACTTTGTCTCTGACTGGTTGGGTTGAATCCTAGGTAGGATACCCGTTCGTGGTTTG 

4210 4220 4230 4240 4250 4260 4270 4280 4290 4300

CCAGACTCTATTATTGATGCCATGTTGTGCTTGCAGACAGGAGCTTAGCATGGCTGTCCTCTGAGACACTCTATCAGCAGCTGACTGGGACAGATGCAGA
GGTCTGAGATAATAACTACGGTACAACACGAACGTCTGTCCTCGAATCGTACCGACAGGAGACTCTGTGAGATAGTCGTCGACTGACCCTGTCTACGTCT 

4310 4320 4330 4340 4350 4360 4370 4380 4390 4400 

TGCCAACCCTTGAACTGAGGTCCAGGACCCCTATGGAAGAATTAGGGGAAGGTTTGAAGGAGCTGAAGGGGATGGCAACCCCATAGGAAAAACAAGTGTC
ACGGTTGGGAACTTGACTCCAGGTCCTGGGGATACCTTCTTAATCCCCTTCCAAACTTCCTCGACTTCCCCTACCGTTGGGGTATCCTTTTTGTTCACAG 



4410 4420 4430 4440 4450 4460 4470 4480 4490 4500 

AACTAACCCTCAGAGCTCCCAGAGACTAAGCCACCAACTAAAGAGCATACATGGGCTG^ 
TTGATTGGGAGTCTCGAGGGTCTCTGATTCGGTGGTTG 

4510 4520 4530 4540 4550 4560 4570 4580 4590 4600 

AGGAGAGGATGTGCCTAATCCTCTAGAGACTTGATGCCCCAGGGAAGGG^^ 

TCCTCTCCTACACGGATTAGGAGATCTCTGAACTACGGGGTCCCTTCCCCTGTTCCTCCCCTGTTCCACCCCTAACCACACCCCATCACCCCCAACCCCC 

4610 4620 4630 4640 4650 4660 4670 4680 4690 4700 

TGGGGGTGGGGATGTGAATGGGTGAGTGAGGGAGGGAATGAGTGAGT 

ACCCCCACCCCTACACTTACCCACTCACTCCCTCCCTTACTCACTCACCCACCATGTCGTAGGAGAGTCTCCGTTTCCCCTTCCCCTCACCTATTGTTTG

4710 4720 4730 4740 4750 4760 4770 4780 4790 4800 

TCTGGGAGCAGGGACGGGGAAGGAGGGCAACATTTGTAATTAAAT 

AGACCCTCGTCCCTGCCCCTTCCTCCCGTTGTAAACATTAATTTATTTATTTTATTAAATTATTTTTTTTACTTCTTTGTCCTATTGAACCCTTACCAAT

4810 4820 4830 4840 4850 4860 4870 4880 4890 4900 

CAGCAGGGCTGGGATTAGAACCCAAAAAGTTTATTCTGAGACTCTTTTCCAATACCAAGCTTAAAGTTTTCTTCAGAATTCTATAGAATGCCTTTTTGGC 
GTCGTCCCGACCCTAATCTTGGGTTTTTCAAATAAGACTCTGAGAAAAGGTTATGGTTCGAATTTCAAAAGAAGTCTTAAGATATCTTACGGAAAAACCG 

4910 4920 4930 4940 4950 4960 4970 4980 4990 5000 

AGAAGTTCTTTGGACTTTAATAAAGAACATATTGAAGAGATGAAAAGAAGCTTACTAAGATCTAATGAAAATCAAGATGCTAGGCACAGTGCCAGATACT 
TCTTCAAGAAACCTGAAATTATTTCTTGTATAACTTCTCTACTTTTCTTCGAATGATTCTAGATTACTTTTAGTTCTACGATCCGTGTCACGGTCTATGA 

5010 5020 5030 5040 5050 5060 5070 5080 5090 5100 

TTAACATAGTAATATGACTCTTTAGAGTTTTGAGACAGGGCCTCATATAGTTTATGATGAATTCACTGTTTTGTCAAAGATGACCTTGAACTCTTAATCC 
AATTGTATCATTATACTGAGAAATCTCAAAACTCTGTC 

5110 5120 5130 5140 5150 5160 5170 5180 5190 5200 

ATTCCCAAAGTGTTGTTGTCATATGTTTGCACCACTCCTGGCTTCATAGTGTTTTTAAAACACCCATGGAGAGTCGGGTGTGAAGATCCACACGTCTAAC 
TAAGGGTTTCACAACAACAGTATACAAACGTGGTGAGGACCGAAGTATCACAAAAATTTTGTGGGTACCTCTCAGCCCACACTTCTAGGTGTGCAGATTG 

5210 5220 5230 5240 5250 5260 5270 5280 5290 5300 

CTCAGCATCTGGTGAATCAAGGCAGGAGGGCGGGTGGTTGCAG^ 

GAGTCGTAGACCACTTAGTTCCGTCCTCCCGCCCACCAACGTCCGACCGATATTATAGATTCAAAGTCAATCATTCCCGACGTATTACTTTGTGACAGAA 

5310 5320 5330 5340 5350 5360 5370 5380 5390 5400 

AAACACAAAACCAAAACCCATGAAGGAGATACTATTGCCATTTAAAAGTC 

TTTGTGTTTTGGTTTTGGGTACTTCCTCTATGATAACGGTAAATTTTCAGAGACCTTACCTTTATCGATAGTATTAGAATGGAGACTCGGTCACAGACGG 

5410 5420 5430 5440 5450 5460 5470 5480 5490 5500 

CTCAGGTGTGCCTGAGGACTGAACAGGGCTATGCACTCCTCAGGTTGGAAACATTACTAGTCCTCAGTGTCTGCTCTTGACCTGTTAACAGCTGAGTCAG 
GAGTCCACACGGACTCCTGACTTGTCCCGATACGTGAGGAGTCCAACCTTTGTAATGATCAGGAGTCACAGACGAGAACTGGACAATTGTCGACTCAGTC 

5510 5520 5530 5540 5550 5560 5570 5580 5590 5600 

GGTCTGCCCTCAGCTGTGCCTGAGGACAGAGCTGAGCTATCTACCCCTGCAGATTGGAAGCATTACAGGCACTCAAGATCAGCCCTGAAGTGATAAAACC 
CCAGACGGGAGTCGACACGGACTCCTGTCTCGACTCGATAGATGGGGACGTCTAACCTTCGTAATGTCCGTGAGTTCTAGTCGGGACTTCACTATTTTGG

5610 5620 5630 5640 5650 5660 5670 5680 5690 5700 

TAAGGCAGAAATCCACCAAGACTAGCAGTGCCTCCGTGTCTCTTCCTGT

ATTCCGTCTTTAGGTGGTTCTGATCGTCACGGAGGCACAGAGAAGGACACCGACCACCCTTTCTCTCCCCGTCAGGAAGGAACTACGTTCCAGCACACAG

5710 5720 5730 5740 5750 5760 5770 5780 5790 5800 

TAGTGGCACGCTTCCTTCATTCCCAGTGAGAGCAAGTGATCACCTGGGTAAGGAAGGTTCAGGTGCCTGAGCTCGCTGGAGAATTCATCACTCATCCATC 
ATCACCGTGCGAAGGAAGTAAGGGTCACTCTCGTTCACTAGTGGACCCATTCCTTCCAAGTCCACGGACTCGAGCGACCTCTTAAGTAGTGAGTAGGTAG 

5810 5820 5830 5840 5850 5860 5870 5880 5890 5900 

ACTCTGCTCCTGTAGACATAATCACTTCTGTTGGGTCTTTATAGAGATGATTTATAACTTTGTTGTTTATAGTTTTTATGAATGTGTGTATTCATTTAGG 
TGAGACGAGGACATCTGTATTAGTGAAGACAACCCAGAAATATCTCTACTAAATATTGAA^ 

5910 5920 5930 5940 5950 5960 5970 5980 5990 6000 

TCACATGGGAGGTACACATTTTCAGGTGTCTGTCTTTCCATCACACGGGCTTTGAATTAAACTCAGTCTTGGTTTTACCGGCTGAGCCATCTCACCTGCC 
AGTGTACCCTCCATGTGTAAAAGTCCACAGACAGAAAGGTAGTGTGCCCGAAACTTAATTTGAGTCAGAACCAAAATGGCCGACTCGGTAGAGTGGACGG 

6010 6020 6030 6040 6050 6060 6070 6080 6090 6100 

TGATTATTTAAAAATCTCCGGAGTAATCCAGGAGTGTGGTTTATGATTGTAGTATCAACACTCGGGAGGCTGAGGGAGCATCGTTATCATGAGCTCCAGG 
ACTAATAAATTTTTAGAGGCCTCATTAGGTCCTCACACCAAATACTAACATCATAGTTGTGAGCCCTCCGACTCCCTCGTAGCAATAGTACTCGAGGTCC 

6110 6120 6130 6140 6150 6160 6170 6180 6190 6200 

CTAGTTCCAGGCTTGCCTAAGCTGTAGAGCAAGTCACTCTCTTAAAAAGTGCCTCTCCCATATTTTTGTATATAATTTGCATCTGAAATTCTGTTTGCCA
GATCAAGGTCCGAACGGATTCGACATCTCGTTCAGTGAGAGAATTTTTCACGGAGAGGGTATAAAAACATATATTAAACGTAGACTTTAAGACAAACGGT 

6210 6220 6230 6240 6250 6260 6270 6280 6290 6300 

ATAACTATGAAATTATTCACATTACTAAAATCTTCCTGTGCCAAGTTCTCCAACGAATTAGATCACACTCAGATGAAATGCTAATAAAAATTAAAGCTGT
TATTGATACTTTAATAAGTGTAATGATTTTAGAAGGACACGGTTCAAGAGGTTGCTTAATCTAGTGTGAGTCTACTTTACGATTATTTTTAATTTCGACA 

6310 6320 6330 6340 6350 6360 6370 6380 6390 6400 

AGCCAGTAGCATGCGTATATTTGGGCTCAGGGCCAACAGGCAGGCGATCTGGGTGTAAGAAAATAGGCTAATGGCTGTGGAATCTGGTCTCTAGTGGCTC 
TCGGTCATCGTACGCATATAAACCCGAGTCCCGGTTGTCCGTCCGCTAGACCCACATTCTTTTATCCGATTACCGACACCTTAGACCAGAGATCACCGAG 

6410 6420 6430 6440 6450 6460 6470 6480 6490 6500 FIQ 1 5~3 

CGCTGAGAGCTGACCTCAACCACGCTCCCTCAAATTGATTGCCTTCCAGGTTATGATTTCTCATCACAGGAAACTTTGTTGCCCAATTCAAACCCTGTGA
GCGACTCTCGACTGGAGTTGGTGCGAGGGAGTTTAACTAACGGAAGGTCCAATACTAAAGAGTAGTGTCCTTTGAAACAACGGGTTAAGTTTGGGACACT 



6510 6520 6530 6540 6550 6560 6570 6580 6590 6600 

GTGAAAACAAAAACAGGAGAGCAAGTGCTGCTCCCCGTGCCCCAAAGCCCCTTCTGTCAGGGATCCCAAATGCACCCCAGAGAACAGCTTAGCCTGCAAG 
CACTTTTGTTTTTGTCCTCTCGTTCACGACGAGGGGCACGGGGTTTCGGGGAAGACAGTCCCTAGGGTTTACGTGGGGTCTCTTGTCGAATCGGACGTTC 



6610 6620 6630 6640 6650 6660 6670 6680 6690 6700 

GGCTGGTCCTCATCGCATACCATACATAGGTCGAGGGCTTGTTAT^ 
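
In the sequence figures above, each numbered block shows an upper strand with its complementary strand printed directly beneath it. The following is a minimal sketch, not part of the original disclosure, of how such a pair of lines could be checked for base-by-base complementarity. The function names `complement` and `mismatches` are hypothetical, and the example uses the first 20 bases of the figure that begins at position 10.

```python
# Illustrative sketch only: check that a lower strand is the base-by-base
# complement of the upper strand printed above it (N pairs with N).
COMPLEMENT_TABLE = str.maketrans("ACGTN", "TGCAN")


def complement(strand: str) -> str:
    """Return the complement of a DNA strand, read in the same left-to-right order."""
    return strand.upper().translate(COMPLEMENT_TABLE)


def mismatches(upper: str, lower: str) -> list[int]:
    """Return 1-based positions where `lower` is not the complement of `upper`.

    Only the overlapping prefix is compared, so a truncated line can still be checked.
    """
    expected = complement(upper)
    return [
        position
        for position, (want, got) in enumerate(zip(expected, lower.upper()), start=1)
        if want != got
    ]


# First 20 bases of both strands, taken from the figure text.
upper_strand = "AAATGTGCTGTCTTTAGAAG"
lower_strand = "TTTACACGACAGAAATCTTC"
print(mismatches(upper_strand, lower_strand))
```

An empty list indicates that the two lines agree over their shared length; any reported position points at a base worth re-reading against the original drawing.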
SEQUENCE LISTING 
<110> Zhang, Ning and Anthony Purchio 

<120> METHODS AND COMPOSITIONS FOR SCREENING FOR ANGIOGENESIS 
MODULATING COMPOUNDS 

<130> 9400-0003.20 

<140> 

<141> 1999-12-16 

<150> 60/152,522 
<151> 1999-09-03 

<160> 51 

<170> PatentIn Ver. 2.0 

<210> 1 
<211> 31 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer PGKF 
<400> 1 

atcgaattct accgggtagg ggaggcgctt t 
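
Each record in this listing declares its length under `<211>` and gives the residues under `<400>`, printed in groups of ten with an optional running position number at the end of a line. As a minimal sketch, not part of the original listing, the declared length can be compared with the number of residues actually printed. The helper names `residue_count` and `length_matches` are hypothetical; the example uses SEQ ID NO: 1 (primer PGKF, declared length 31).

```python
import re


def residue_count(sequence_text: str) -> int:
    """Count residues in a listing entry, ignoring spaces, line breaks and position numbers."""
    return len(re.sub(r"[^A-Za-z]", "", sequence_text))


def length_matches(declared_length: int, sequence_text: str) -> bool:
    """Return True when the <211> value equals the number of residues printed under <400>."""
    return residue_count(sequence_text) == declared_length


# SEQ ID NO: 1 as printed in the listing (primer PGKF, <211> 31).
pgkf = "atcgaattct accgggtagg ggaggcgctt t"
print(residue_count(pgkf), length_matches(31, pgkf))
```

A mismatch between the two numbers is a quick signal that a line of the entry was dropped or split during transcription.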

<210> 2 
<211> 30 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer PGKR 
<400> 2 

ggctgcaggt cgaaaggccc ggagatgagg 

<210> 3 
<211> 32 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer NeoF 
<400> 3 

acctgcagcc aatatgggat cggccattga ac 

<210> 4 
<211> 37 
<212> DNA 

<213> Artificial Sequence 



<220> 
<223> Description of Artificial Sequence: primer NeoR 
<400> 4 

ggatccgcgg ccgcccccag ctggttcttt ccgcctc 

<210> 5 
<211> 30 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer TKF 
<400> 5 

ggatcctcta gagtcgagca gtgtggtttt 

<210> 6 
<211> 30 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer TKR 
<400> 6 

gagctcccgt agtcaggttt agttcgtccg 

<210> 7 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer F5R51 
<400> 7 

gtacatttaa atcctgcagg 

<210> 8 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer F5R52 
<400> 8 

agctcctgca ggatttaaat 

<210> 9 

<211> 77 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer F3R31 



<400> 9 
ggcccgggct taattaatgc atcatatggt accgtttaaa cgcggccgca agcttgtcga 60 
cggcgcgccg gccggcc 77 

<210> 10 
<211> 77 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer F3R32 
<400> 10 

gatcggccgg ccggcgcgcc gtcgacaagc ttgcggccgc gtttaaacgg taccatatga 60 
tgcattaatt aagcccg 77 

<210> 11 
<211> 39 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer VN1R 
<400> 11 

ctgtatttaa atctgcccac cctattcagg acagtagtc 

<210> 12 
<211> 31 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer VN1F 
<400> 12 

ccaatgcatc aacccagcca ggaggagtgc g 

<210> 13 
<211> 38 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer VN2R 
<400> 13 

aacgcgtcga cttcggagat gtttcgggga taaccagg 

<210> 14 
<211> 40 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer VN2F 
<400> 14 

ttggcgcgcc ccatagagaa gagacaccaa aggcacgctc 
<210> 15 
<211> 36 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer FosB1F 
<400> 15 

ctgtatttaa atcccgtttc tcactgtgcc tgtgtc 

<210> 16 
<211> 35 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer FosB1R 
<400> 16 

gtctcctgca ggcttcctcc tccttgttcc ttgcg 

<210> 17 
<211> 35 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer FosB2F 
<400> 17 

aacgcgtcga cggatgggat tgacccccag ccctc 

<210> 18 
<211> 33 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer FosB2R 
<400> 18 

ttggcgcgcc ccttgcctcc acctctcaaa tgc 

<210> 19 
<211> 31 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer VF1 
<400> 19 

acctcactct cctgtctccc ctgattccca a 

<210> 20 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer VR1A 
<400> 20 

gctctggcgg tcacccccaa aagca 

<210> 21 
<211> 28 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer VF2 
<400> 21 

ccctttccaa gacccgtgcc atttgagc 

<210> 22 
<211> 29 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer VR2 
<400> 22 

actttgcccc tgtccctctc tctgttcgc 

<210> 23 
<211> 28 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer KF1 
<400> 23 

gctgcgtcca gatttgctct cagatgcg 

<210> 24 
<211> 30 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer KR1 
<400> 24 

ttctcaggca cagactcctt ctccgtccct 

<210> 25 
<211> 32 
<212> DNA 

<213> Artificial Sequence 



PXE-012.US 



<220> 

<223> Description of Artificial Sequence: primer KF2 
<400> 25 

cagatggacg agaaaacagt agaggcgttg gc 

<210> 26 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer KR2 
<400> 26 

gaggactcag ggcagaaaga gagcg 

<210> 27 
<211> 29 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer TF3 
<400> 27 

agcttagcct gcaagggtgg tcctcatcg 

<210> 28 
<211> 31 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer TF2 
<400> 28 

caaatgcacc ccagagaaca gcttagcctg c 

<210> 29 
<211> 33 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer TR1 
<400> 29 

gctttcaaca actcacaact ttgcgacttc ccg 

<210> 30 
<211> 24 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer VR2F 



<400> 30 

cgctagtgtg tagccggcgc tctc 24 

<210> 31 
<211> 38 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer VR2R 
<400> 31 

ataagaatgc ggccgcctgc acctcgcgct gggcacag 38 

<210> 32 
<211> 4486 
<212> DNA 
<213> Mus sp. 

<400> 32 

aagcttgcag ggaggtagga ggcagcctgt ggcgttgatt caatgcacct ggccttatcc 60 
tcggatgaga tcggtcacca gtcaaaaact gtgagcttga aggtcttggg tgcttaacat 120 
ctatttttac aaatcttatt tagcaactta gaactgtgaa atattggaaa gctacttaaa 180 
ccttctaaac tccctcctcc acactatgag aatgttacat tttctattca gttatttttg 240 
agcagtaaac agatgaatca aggaatatgc ccatcacatc aagagtgctc ctaaatggac 300 
ttgcttgtta ttcatttaca gtgtggcccc ttgactttca tcggcactcc tagcagaaaa 360 
caaaatccgc cagatggagc tggagagatg gctcagctgt taagaatact tatccctaca 420 
caggccctgg agccagttcc cagcacccac acggtggctc acaaccatct gtaactccag 480 
ttctaggaga cccgactccc tcttctgtct gaaaacacca ggcacgcgtg cggctacata 540 
caaacatgaa agcaaaatac acacattaca taaataaatc ttaaaaaatg attcggggtg 600 
ggggaaggaa aaaaaaggat gttagaaaat cgatgtaact gttttttcct tttgcacaga 660 
tctaagtagg gaaggagaac attctcttac catcgagaat aattgttttc attgccccca 720 
agtctgctaa tagagcttgc taccttcatg gctgtcgtaa ggatgaggca aagatggact 780 
tcagctttca gactgtgtct gctcaaatgt tggctactcc tgttttctga cccccttctc 840 
tggtgcaatg tggactttca attaatttcc ctgcatcttt tacatatttg atttaaaaaa 900 
tattttattt tatgtaattg tatgtatatg catgtcaata agcatatgtg tgtgtgtttc 960 
catggaaacc aaggcaacag attctccaga gctgtagaaa tgggctgtga gacgcccact 1020 
gtgggtgctc ggaaccaaac tcgggtcctg tggaaagaca gcgagcaccc ataatgcaga 1080 
ggtatctctc agactctact ttaaaatttc aatttatctt tttttttttt aaagttccaa 1140 
gtaactatag gaaagtacat gggtatatag atccccagta ccaagattct tcctttgcag 1200 
gtagcacaac ttggtctgct tcacataaag aatggaaagt cattaaaaca ctcatcacac 1260 
tgtaaagtag aattgaactc tgacagaaca agcgaagtga gtctgacttc caggtaactg 1320 
agccttcttt tcctcctaaa gacacaagcc atacacagag taaaataaac ttgggcatgg 1380 
tgagaaggaa acaacgcagg agggctagcc aagtctgaga gtcgtgagtg tgctcggttt 1440 
ataaacggag cccaccttgc cagcgaggta gtcacatgct ctgctaaaca gaaacttaag 1500 
aaaacactta cacgaagcaa acatggggaa gtgccatgca agcatgtgac tgactggtgg 1560 
caatgaccga aaccacagca gccactagaa aaggaagggt agtgcgccac actgtagttg 1620 
tgaaaatgaa cttattcatt tattttgaaa aacgtgtaag aagcaaagat gtcttctttc 1680 
ccacctacct ttgcggcagg cgagcacttc ctggaattta taaagtgcga tctttctggg 1740 
gacttctcat aacatttcct actgctcatc tatgtctgtg tcaaatagag aatgctcttg 1800 
aacaagtgtg tgtgtgtgtg tgtgtgcgcg cgcacgcgca ctcactcctg ctctgttgag 1860 
gtccagtttt gatggtcccg ccagaggtat atttgagtat catttctcaa gagcttcagc 1920 
tgggagacac tgcctcttac tggcctgaag gtcactagct gattcatctc cgtttgggct 1980 
ggcgcgcctt ggggatcctc ctatctctcc ttccccagtg ctgggataac aaggttggca 2040 
ccacatgagc cttttaaaat gtgagtttgg aagctcaaac gcaggttttc atgcttgcac 2100 
tgaaacttca caagctgaac cgtctccctc tccttccctc tcttttttcc ttttcttctt 2160 
cctttttaaa acacatcttg tctttaaaaa aaaaaaaagg cccaaaacaa gtgtaaagta 2220 
tttccctatg tgtgtggagg gagggagtat aggaggctga tttcactgag atcctgttaa 2280 
atttgggtgc catagccaat caaagacgca tcgtttcctc taagaattct aaatggggcg 2340 
attaccacgg gcctgcaggt tctggtttgt attagaggag acactgtctt cttaagtaaa 2400 
acatagaagg ggaagtgtcc agaattgtaa ataaggcttc gagagaagcc ttgtctggcc 2460 
accgggatgg agaagaccta ccttcgccta tccaggatcc atcgtccctc cctctaccca 2520 
gatctgacag ccctccttgg ctcttttgct gaggtttgtt tgagtttgtt ttactctctg 2580 
caagagaagt ttccttaaac attctaccct gttcacaagt aaatacacct cttagctaag 2640 
aggccacaca cccaggggaa caccgataaa aagaacaagc cagaaccttc agaacgctgt 2700 
cgataggtac accaagcagc cttcatacgg agttttcatt cgtgaggagc tgaatataca 2760 
acaaagctaa atgtgagcag accaggcatg cctctgctaa atgaggatgc ccacaccaaa 2820 
catgcccaag atcttcaagt ataattttat tatatagatt cgctatgtgt tgacatgttt 2880 
ttatagtgaa cctggatttt acaaaccctc ctggtttgcc acctgcttct ggcaccatac 2940 
ttgaggctta ggcacgtgat aaaggagcat gcctgtttcc ccccttattt tttttaaaga 3000 
aaagcaccat gttacatcat taatcatgca tatcagtgta gtttagatcc gatgtagaga 3060 
caataatctt atctctttgt ctggctgaaa gactgtcctt taaactatca ttctaaatgc 3120 
atttggtttt tgccaggagt aaaacatgtc acaagatatt tgttgtcatt tcccaggcgt 3180 
ggaaggaaag gaatggaaag aaaacgaggg gtgaaggctg ctgttcctct ctagtcgcta 3240 
cttgaagtct acatagctgg gggggggggg gggactgttc acatgggacc ggtttcctct 3300 
ttgttcctac actggcgcct ctggcaagaa actctccctt ctcttccccc caagcatatc 3360 
ttggctgaaa ggtcagctct gaaaaggggc ctggccaaag ttactgtagg ggaccgtggt 3420 
catggaactg ggtagacaaa agcactctag cagccactgg agaaggaccg ggggctcttc 3480 
tctgtgcatt tgccctggag ccctgaccac cgccagctcc ctgcatctcc ttgctatggg 3540 
ttttctggac cgagccaggc aggagttcac aaccgaaatg tcttctaggg ctaatcaggt 3600 
aacttcggac gatttaaagt tgccagatgg acgagaaaac agtagaggcg ttggcaacct 3660 
ggataagcgc ctatcttcta attaaaacat tcagacgggg cgggggatgc ggtggccaaa 3720 
gcaccataaa acaaaacttc caagtactga ccaactcact gcaagtttgt gccccgagta 3780 
catctaggtt caggggtctt gtcttcatgc tcccaactgc gggcggattt ttggtccctt 3840 
gggactttca gtgcagcggc gaagagagtt ctgcacttgc aggctcctaa tgagggcgca 3900 
gtgggcctcg tgtttctggt gatgcttccc aggttgctgg gggcagcaag tgtctcagag 3960 
cccattactg gctacatttt acttccacca gaaaccgagc tgcgtccaga tttgctctca 4020 
gatgcgactt gccgcccggc acagttccgg ggtagtgggg gagtgggcgt gggaaaccgg 4080 
gaaacccaaa cctggtatcc agtggggggc gtggccggac gcagggagtc cccacccctc 4140 
ccggtaatga ccccgccccc attcgctagt gtgtagccgg cgctctcttt ctgccctgag 4200 
tcctcaggac cccaagagag taagctgtgt ttccttagat cgcgcggacc gctacccggc 4260 
aggactgaaa gcccagactg tgtcccgcag ccgggataac ctggctgacc cgattccgcg 4320 
gacaccgctg cagccgcggc tggagccagg gcgccggtgc cccgcgctct ccccggtctt 4380 
gcgctgcggg ggcgcatacc gcctctgtga cttctttgcg ggccagggac ggagaaggag 4440 
tctgtgcctg agaactgggc tctgtgccca gcgcgaggtg cagatg 4486 

<210> 33 
<211> 38 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer VEF 
<400> 33 

acacgcctcg agaaatgtgc tgtctttaga agccactg 38 

<210> 34 
<211> 40 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer VER 



<400> 34 
acacgcgtcg acgatccaat aggaaagccc ttccataaac 40 



<210> 35 
<211> 511 
<212> DNA 
<213> Mus sp. 

<400> 35 

aaatgtgctg tctttagaag ccactgcctc agcttctgca gctcagatac caaaggaagt 60 
ctggtacaca gcatgataaa agacaatggg acggggtcac agtggctccc gtccctttca 120 
ggggtatgga gacgagctgt agagagatgt ctccagggag ttttcattaa tcagcaattt 180 
agtcagatct gtgcatccta tgctttacaa gaaatgtcag tgggcctgag atcatcagat 240 
ggaggttcat cgggtttcaa tgtcccgtat ccttttgtaa gaccttgaag ttggcaacgc 300 
aggaaaacag gaactccacc ctggtgccgt gaattgcaga gctgttgtgt tggtttgtga 360 
ccatctgccc attcttcctg ttatgacaga gcttgtgaac tttaactggg actggggcaa 420 
agtcaatccc acctttatac aatgaattgc tgaagaggcc ttttaaaact tggagtgtgc 480 
attgtttatg gaagggcttt cctattggat c 511 

<210> 36 
<211> 34 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer 
GL3B-Forward 

<400> 36 

gtacttaatt aagcttggta cccggggcgg ccgc 34 

<210> 37 
<211> 34 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer 
GL3B-Reverse 

<400> 37 

agctgcggcc gccccgggta ccaagcttaa ttaa 34 

<210> 38 
<211> 26 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer T2 
Forward 

<400> 38 

tatcaacact cgggaggctg agggag 26 

<210> 39 
<211> 41 
<212> DNA 



<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer T2 
Reverse 

<400> 39 

ataagaatgc ggccgcactt ccccagatct ccccatccag c 41 

<210> 40 
<211> 7093 
<212> DNA 
<213> Mus sp. 

<400> 40 

ggtaccaaag catagaacta cagatccgct ctctgcctgt accaccctct ggcatttaat 60 
cacacaatgc ttggttttgt tcttcaactt ttcctgttat gatgcagtcc ctggcttgtg 120 
taactatgag cttcaaaagc aaagaacgca tcatctattt ttgtgtctct tcttccaagg 180 
acttagtgta tcacttactg gctaaatgct tgagacaaaa acagggatta atgaagaaga 240 
aagagaaaga aaaggaaggg aaagtgccca caattactga cagggtttca gtaaagcagt 300 
ctagagggtc aggtattttc catagccatg ccccagagtg ggtgttgcca ctttagctgc 360 
cctggtctgg ctgaaggcca ggacttgatt gttgatggcc cttcctttgc tgctagtcac 420 
tgttaagtac tgcagattta cagaaagctt catggaggtc tgtaagaagc cagaggtgat 480 
aacaccaaga tttagagcca ctgaccagca gaatgcagaa tgtccaggct atgatccagg 540 
ttgtagatcc tgatctgact actcaagact ggttgaaggc aaggttcact tggattcact 600 
ctatttgcca gcagatgttt taaatccatc atatatatat atatatctcc attactttag 660 
gacagtggtt ctcagccttc ctaatgctgt agccctttaa tagagttcct catattgtga 720 
ttgtaaaaat tattttgttg ctacttcatg actaattttg ctactgtgaa agggtcattt 780 
taccccaggc tgttgagacc cacatgttgg gaaccactac tttagaaggc attggggttg 840 
gagaagaaca tgaagaatag agtaacagtg gtcagttttg gttcattata tcacagaaac 900 
attcacttta aggtttcagc atgtttgttg tgtatatgtg attgtgtaaa gacttcacca 960 
ggtctttctt taatcaccat acctaacatc ttcaccactc catatccatc agcttcacct 1020 
tgtactctag catttgggca ttcatcctgt accagggcag gcatccattc ttttgcaact 1080 
cacattgttt cctagttttg attattacca acaatgcttc tagaccatga attttggtct 1140 
ttgacttttg cttggtaaac atcataaaac aatccagtgg tggtggtggt gccgctgctg 1200 
ctggtggtgg tggtaaagca ggaagccata aagtgccttt attcaatctg tatttgatac 1260 
aaattgttat ttcttcccat gtaaaagata tggcatctga agtgtagagg tctgaattca 1320 
aacctcacat caccagatag tatattacag actcaacaaa taatacacgg ctttgcctga 1380 
cttcaaagcc ctgttcttga cgtaagtata tgagtaacaa tggtagcacc ttagttttta 1440 
tcagttcact aaatatttat ataagaccta ctatgaaggg agatagaagg gtatgaggtg 1500 
gggtcatggg aataggaaaa cggtggaagg gagaaggaga attaacaaaa gctaattatg 1560 
tttgaaaatg ccacaatgaa acctaattta caaaagaacc actatatgac cttcacagtg 1620 
tgtgctaagt cttggagatt tagtggtgaa gaagtcaggt gtgtttccaa tctcatggag 1680 
gatgtaatca gttagagagc acaggagcac ataaaaagat aggcaaaaat gtatgattag 1740 
taccatgtaa gatatgaagg ggaacacagg aaactagtgg ggagacctaa tttagtttga 1800 
gtggtcttca aagacccttt agaagctgag aactaaagac agcaagcaag gtgagggcag 1860 
catctccacc tttccagtgg aatgagcaac ttagggtata cagctgattc ccacattgtc 1920 
aacaaggctc ttcagagact agagatgcac taatgatgac catacccagc ttttaaggaa 1980 
ggtttctgag catgtccaag caccctacac taggcattgg aaatcaacat gtccagagat 2040 
ggaagtgaca gtcagtaagc caaccctttt caaaacttcc aaagctatta ctcgtcaact 2100 
ctccagacat atgggccccg agtgtgttgg gaagctctca ttattgttct ttgattggtt 2160 
ctctacattc cgagatccaa ggagcagtta tctcaggtag aggatcgtgg aatgtctgcc 2220 
catgattaac ttcaatttat acctgtaagt tataccacat cctaaacacg ctgatgtccc 2280 
agagaacatt ttgaccagct gctaacaaaa cccaggagca tttagaaaaa aactgagtca 2340 
cccaccgttc tggataatga tggagagaaa caaatgggat tattcttaca gagtatgaaa 2400 
gttacataat tttcctggat aatggagaat taattaaaca tcagcatctt ttctggactg 2460 
cagagggaag acagaggtga agccaatctt tccgggaaat ggaggaggaa agaatttgac 2520 
tactatttgg gggttaacaa tacatcttac tagcatggca aaggaaactg ggctgctttt 2580 
cagagtaagc caccccagta gatgctgcaa ggctgtgctt tcatcccagg agaaagtcaa 2640 
cagggccagg catgccagaa catgcccata atgtaaccac ttaggctgag gcagaaagat 2700 
caaaaatccc aggccagctt agtttgtgta acaagacctt tgctcaaaca aagatttaca 2760 
aaacaaacaa gcaaacaaac aaatataaaa aaggagaaga aaataactgc caggggaggc 2820 
tgtgagcaat gaagacttga tgagtgacca tctcgcacag tggacgcttg tgtctagaag 2880 
gtaagggctt ggcaatgttt cccaggtttt ccattcctgg tttatatggc ttgaggccag 2940 
tggacttcac aatgtctcag cttccaggtc tttatacaga gcatattagc cacatgtggt 3000 
agcttgtgcc tgtaatgctg gcacttgaga gaccaagaca ggaggattgc cacaagtctc 3060 
catccagcct aggtgctgtg tcactctgtc tcacccctga cccagtccca cccaacatca 3120 
aacaggctat cactgtgaca ctggtactga gtcagaatca cccagattaa agattctggg 3180 
agatcagtcc tggggatgcg ggaagtgaga ccagttattt aataattctt atactcatga 3240 
gatgatggat ccagatgaga aattgtaaaa attttaggtt ttataattga agaaataggt 3300 
ggtttcttca ggttacatct ctccactgtt ggtcatttca gctaaggtca ctccccattg 3360 
attcctgtga ggctctcaca tcccaggtct ctgggacttt ctagaggttc ccgctgcttc 3420 
ccagccctga aaatgcgtat ttctattcat tctcctggca ttctgggctt ctctcctgtc 3480 
ccccgcccca cccaacacct gatcctgccc cctttctctc ccccttctct ctctaaacca 3540 
ggtccctccc tccctctgct tcccatgatt attttgttcc ctcctctaaa tgagtctgaa 3600 
gcatcctcac ttggacnttc cttcttgtta aacttcatat ggtctgtgag ttgtatcatg 3660 
ggtattctgt acttttttgg ctaatgtttc acttatcagt gagtgcaaac caggcatatc 3720 
cttttgagtt tgggttacct cactcaggat gatattttct agttctatcc attcgcctgc 3780 
aaaattcatg atgtcctaat ttttagtagc tgaatagtat tccattgtgt aaatgaacca 3840 
tattttctgc atctgttctt cagctgaggg aaatctgggt tgtttccagc ttctaggtat 3900 
tataaataag gttgctatga acatagtgga acacatatcc ttgaggtatg gtagagcatc 3960 
ttttgggtat atatccagga gtggatagtt gggttttcag gtagaactat ttccaatttt 4020 
ctaaggaacc accagattga tttttagata gacagggccc ctagtggaga gatggggcca 4080 
aacacctacc ttcaaaaatt tggtccagaa ttgttcctct ctaaaagaaa tgcagggaca 4140 
aaaatgaaac agagactgac caacccaact taggatccat cctatgggca agcaccaaac 4200 
ccagactcta ttattgatgc catgttgtgc ttgcagacag gagcttagca tggctgtcct 4260 
ctgagacact ctatcagcag ctgactggga cagatgcaga tgccaaccct tgaactgagg 4320 
tccaggaccc ctatggaaga attaggggaa ggtttgaagg agctgaaggg gatggcaacc 4380 
ccataggaaa aacaagtgtc aactaaccct cagagctccc agagactaag ccaccaacta 4440 
aagagcatac atgggctggt ttgtggtccc tggcagagga ctgccttgtc tggcctcagt 4500 
aggagaggat gtgcctaatc ctctagagac ttgatgcccc agggaagggg acaaggaggg 4560 
gacaaggtgg ggattggtgt ggggtagtgg gggttggggg tgggggtggg gatgtgaatg 4620 
ggtgagtgag ggagggaatg agtgagtggg tggtacagca tcctctcaga ggcaaagggg 4680 
aaggggagtg gataacaaac tctgggagca gggacgggga aggagggcaa catttgtaat 4740 
taaataaata aaataattta ataaaaaaaa tgaagaaaca ggataacttg ggaatggtta 4800 
cagcagggct gggattagaa cccaaaaagt ttattctgag actcttttcc aataccaagc 4860 
ttaaagtttt cttcagaatt ctatagaatg cctttttggc agaagttctt tggactttaa 4920 
taaagaacat attgaagaga tgaaaagaag cttactaaga tctaatgaaa atcaagatgc 4980 
taggcacagt gccagatact ttaacatagt aatatgactc tttagagttt tgagacaggg 5040 
cctcatatag tttatgatga attcactgtt ttgtcaaaga tgaccttgaa ctcttaatcc 5100 
attcccaaag tgttgttgtc atatgtttgc accactcctg gcttcatagt gtttttaaaa 5160 
cacccatgga gagtcgggtg tgaagatcca cacgtctaac ctcagcatct ggtgaatcaa 5220 
ggcaggaggg cgggtggttg caggctggct ataatatcta agtttcagtt agtaagggct 5280 
gcataatgaa acactgtctt aaacacaaaa ccaaaaccca tgaaggagat actattgcca 5340 
tttaaaagtc tctggaatgg aaatagctat cataatctta cctctgagcc agtgtctgcc 5400 
ctcaggtgtg cctgaggact gaacagggct atgcactcct caggttggaa acattactag 5460 
tcctcagtgt ctgctcttga cctgttaaca gctgagtcag ggtctgccct cagctgtgcc 5520 
tgaggacaga gctgagctat ctacccctgc agattggaag cattacaggc actcaagatc 5580 
agccctgaag tgataaaacc taaggcagaa atccaccaag actagcagtg cctccgtgtc 5640 
tcttcctgtg gctggtggga aagagagggg cagtccttcc ttgatgcaag gtcgtgtgtc 5700 
tagtggcacg cttccttcat tcccagtgag agcaagtgat cacctgggta aggaaggttc 5760 
aggtgcctga gctcgctgga gaattcatca ctcatccatc actctgctcc tgtagacata 5820 
atcacttctg ttgggtcttt atagagatga tttataactt tgttgtttat agtttttatg 5880 
aatgtgtgta ttcatttagg tcacatggga ggtacacatt ttcaggtgtc tgtctttcca 5940 
tcacacgggc tttgaattaa actcagtctt ggttttaccg gctgagccat ctcacctgcc 6000 
tgattattta aaaatctccg gagtaatcca ggagtgtggt ttatgattgt agtatcaaca 6060 
ctcgggaggc tgagggagca tcgttatcat gagctccagg ctagttccag gcttgcctaa 6120 
gctgtagagc aagtcactct cttaaaaagt gcctctccca tatttttgta tataatttgc 6180 
atctgaaatt ctgtttgcca ataactatga aattattcac attactaaaa tcttcctgtg 6240 
ccaagttctc caacgaatta gatcacactc agatgaaatg ctaataaaaa ttaaagctgt 6300 
agccagtagc atgcgtatat ttgggctcag ggccaacagg caggcgatct gggtgtaaga 6360 
aaataggcta atggctgtgg aatctggtct ctagtggctc cgctgagagc tgacctcaac 6420 
cacgctccct caaattgatt gccttccagg ttatgatttc tcatcacagg aaactttgtt 6480 
gcccaattca aaccctgtga gtgaaaacaa aaacaggaga gcaagtgctg ctccccgtgc 6540 
cccaaagccc cttctgtcag ggatcccaaa tgcaccccag agaacagctt agcctgcaag 6600 
ggctggtcct catcgcatac catacatagg tggagggctt gttattcaat tcctggccta 6660 
tgagaggata cccctattgt tcctgaaaat gctgaccagg accttacttg taacaaagat 6720 
ccctctgccc cacaatccag ttaaggcagg agcaggagcc ggagcaggag cagaagataa 6780 
gccttggatg aagggcaaga tggatagggc tcgctctgcc ccaagccctg ctgataccaa 6840 
gtgcctttaa gatacagcct ttcccatcct aatctgcaaa ggaaacagga aaaaggaact 6900 
taaccctccc tgtgctcaga cagaaatgag actgttaccg cctgcttctg tggtgtttct 6960 
ccttgccgcc aacttgtaaa caagagcgag tggaccatgc gagcgggaag tcgcaaagtt 7020 
gtgagttgtt gaaagcttcc cagggactca tgctcatctg tggacgctgg atggggagat 7080 
ctggggaagt atg 7093 

<210> 41 
<211> 1659 
<212> DNA 
<213> Mus sp. 

<400> 41 

ctcgaggtcc agtatggctt ctcaaccttc ttggcaagaa ggctgcaggg acgaccagga 60 
agtttgaaac agtcttagaa gaaaatgctg gcttagagac aggtggcaat gggggatggg 120 
gagcagtatt ctggtttgca tagaggcaga gtccttccaa gtgctgggaa acaaggcagg 180 
agggcaggga tagagcaaat gatggctctg tatgtgtccc tgttcagttt gcatttaatc 240 
tgagcaaaat ttggcttttg acatctgcaa ctcaaaagaa ggtaattagg caaatgactg 300 
acacatagat atcttaatag tcaaggaatt tttttttttt tttttgaaga gttagcagtc 360 
aggggatggt agaaactgca aaaccaatcc gtattctttc ttgagatttt tagacagttg 420 
atgctactag ccacaaaaag agttttaagt gggaggagag taagatgcag gcaccaaggt 480 
gacaggctcc aggtctgtag cattagctta cagatgagat tctttacaga gagccaggca 540 
gctgcattgg ctaaagcaga tctgggaggg ggccaggaga tcagctggcg gcactcccag 600 
cctccaggaa aggcaaccct tatttctgga attttaaact gataacccaa ttcccaccag 660 
cctggccagg ctcttcctta gctcacatca caaacacaga aggattgttt tagatggagt 720 
catgcttgat tctttctata cctacttcca agaccaattt tataaaagtt tatttaccgc 780 
ccgtgtgtgt gtgtgtgtgt gtgtgtgtgt gtgtgtgtgt gtgtgtgtgc atggtatata 840 
tggatgtcag agtttggttc tctccttctg cagtgtggct cttagagatt gaactcagat 900 
catgagcaag caccttgctg cctgctatgt ccctccagca gtctgaccat gttccttccc 960 
ccaagattgt ggaagctgga ctgaagatca caatctgcca gatgggcaga atctttactc 1020 
tttggcacat ttgttgctga tggggagtga atacccatgg ggacatggct gtcatggtgt 1080 
ggaagtgata gaaatgaaaa catgtatgga tctgtcacag gagctggtga ggctgatggg 1140 
tgtgtgggtg gccactgttt gctctctgct tgtcacagcc tcttgttcag ggcttgatca 1200 
ggcaggtgtg tgtgtgtgtg tgtgtgtgtg tgtgtgtgtg tgtgtgtggt cacacccatc 1260 
tcagcagatc tgtcagcttt cccgcttttg ttagagggtg atatcatgct tcctgggggg 1320 
agctctggaa gacaatgagc agccactttc ctctagatac aataggcgga gtcaggaagg 1380 
tagtattgac attgctgggg cctaggagct actcactgct cggtggccgt cagatggtga 1440 
accggcgtaa ccttggcaca caggcctggg ctgtacaagg cgtctggctg cagggccaaa 1500 
gaggactcca ccctagggac aggagtactt cagacatctg ggaatctggg atgggtttta 1560 
aaattcagat cccaatataa aaaaacaact cccaaacaaa cagcagcaat taaaaaaaaa 1620 
aaaaaaaacc agcctcccaa gtaaaacaat aatggtacc 1659 



<210> 42 
<211> 33 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer F51 
<400> 42 

cccagtgtct ctgatttagg gagagcacct gag 33 

<210> 43 
<211> 26 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer R51 
<400> 43 

ccagactgcc ttgggaaaag cgcctc 26 

<210> 44 
<211> 35 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: primer F52 
<400> 44 

cagtgagagt cttctctgtc cctcaatcgg ttctg 35 



<210> 45 
<211> 26 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer R52 
<400> 45 

tggatgtgga atgtgtgcga ggccag 26 

<210> 46 
<211> 32 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer F31 
<400> 46 

aatcaaagag gcgaactgtg tgtgagaggt cc 32 

<210> 47 
<211> 26 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: primer R31 






<400> 47 

cggctcccca aaatgtggaa gcaagc 26 

<210> 48 
<211> 30 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer F32 



<400> 48 

gaatccatct tgctccaaca ccccaacatc 30 

<210> 49 
<211> 26 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer R32 
<400> 49 

cgcctcctct ccccagtctc cccttg 26 

<210> 50 
<211> 4998 
<212> DNA 
<213> Mus sp. 

<400> 50 

tccacccacc tgtttctcac gtccccggcc ttcctagtta acttcatggt taaagaagcc 60 
tcacccgggg agggtgtggt gccacagaag gaagggtgct cccacaagcc cccagtgtct 120 
ctgatttagg gagagcacct gagcccagtg agagtcttct ctgtccctca atcggttctg 180 
aaattcccca cttgccctcc ttatccaggg gacagggctg cccaccctat tcaggacagt 240 
agtcttaaac tcgtagccaa cagacttttt attgggctgg gagaaagaga tgaggctcct 300 
gaagctcagc cgagtgggct ctgattccta cttctcagag gtcgggcagc ccagccaata 360 
ctgagcaatg gagcgtgggt agggaggatt cacagagtcc actcgccggg ttctaaggtt 420 
gactcggtag tatttgtctg aaagaaagaa tggaaaaagg gttatgtgag attctgcctg 480 
atcctgtcca ctggtcccaa gaaggataaa ggctttttct cagaagggaa agtgaacatc 540 
caccaagcag ataatgtcac catctacagg ctgtgttcag cacccaggga ccaagacctg 600 
caggcaaggc ctagccaaaa ccagtctaag gagtagaaag gggctcccac ctccagagaa 660 
gaaatagacg ctctgaatgg gctcgcaggt ggcaggtaca agccagtcca tatcataatc 720 
atagttgttg taggttccta gcccactctc ctcgctggag aacaaagaga accagattga 780 
acgtgatgaa cgacgggagt tcgagctctg gctgcgtctg tggccacgcc ctcggcgtga 840 
acgatagcgc tttcggcttc tacgcttaga cttctgtttt ttggcttggg cagagtggga 900 
taaggagcca gtgacgtaga tgcggccggc catagcagcg tccactttcc ctggcacacc 960 
atgccagttc cggctgatga attggggttc tctggctcca tctgtaacag ggaaggggtt 1020 
aatgcacttg gcagattctg gctttgattt ctccagcaag gttgtctgtc tatctattta 1080 
tctatcttta tctatgtatc tatctatata tctatgtatc tatctatcta tcatctacct 1140 
acctacttac ctatctatgt atctatctat ctatcatcta cctacctact tacctatcta 1200 
cctatttatt tgtttgtttg ttttctttga aacaggatct tagcacctac ctatggctgg 1260 
tttgcaactc actatgaagc cataactggc ctcttaactc acaaagatcc acttgcctgt 1320 
gtctctgagt gctgggatta aaagcatgtg ccactacacc cagctccagt aggaccttta 1380 
gaacacattt gctatgcctt gcctaagaca cacaactcag tccccaggcc ccagcctccc 1440 
tgtctagagc tttttcccat cctctctcca ctgtatccct tgaatctctg ccccatccga 1500 
aacccctcag cgcgcagccc ctccttctgc tgtgttaggc aaagtccaag gtatgggatc 1560 
caaatagagc caagcctcat cccccaaaag tcaacagaag caaagtctag ccagagcaaa 1620 
cagctcttga tcgatggtgt cacagttcca ggcccctccc ctggaagccc ccactatcac 1680 
agcccagttt ccagagaaag aagccagcct tgctctccct ccataccaga ggatctgccc 1740 
cagaagagga gttcgaaaat gttctcccag ctgtcccgct gaagcaaggc aaagtgctca 1800 
aacacggctg acagagagct gccttcgcac tcctcctggc tgggttgctg ctgaaattcg 1860 
tactcccagt actgcttccc tgaggagcag aacagctggc atcaggagag atctgaccaa 1920 
ggcagagagg aatcatggaa tagaacaggg actccaccac ctgccccctt ctcctccacc 1980 
ctgagtaccc ttgaagaagt agaccctttc ccggccactg taacggtggg caggaagggc 2040 
gaacgctgca tcaacattgt ctggtatgcc actgaagcct tcggagatgt ttcggggata 2100 
accagggtcc aggaccccat cctcaaagcg ccagtactga ctaccctgaa agacagagat 2160 
cagaagggtg aggacatacc gctggccaca gaagcagtcc tatatcctaa actggctgtc 2220 
acctgctcct ggagtccctg actgctttgt cttcacagct ccccagcacg tccatggcac 2280 
cctttacctt gcctcagact taggtctggt accttgaaca agtaggtctt cccctgacag 2340 
ttgatgcgag tgaaggcagc atcgatgggg ccctcaatgc cccagacatc ttggataagt 2400 
ttggggtacc caggcctcac tgccgtctca tctagctcat agcagtactg ccctagaaca 2460 
ggggaaactg tgtgagaagc agatgagcct aaggcagatc cgaccgccac cagacctgtc 2520 
catagagtca cctcggaagg caaagaggga cccattcttg agatccgtga aggcgtcaaa 2580 
gggctttcca ctgcacagtt cttcctctgg aaactcaggg gtcccttgat cagtggtgtc 2640 
gggccttagg atctcctcct gttgctccac tttaggcgct ggggtgcttg gctgttcctc 2700 
aggatctagg aaggctgtcg gctttagagt gccgtccgtc cgaggattta ggtcaccggg 2760 
tggagaggtg ttctcgggtt gcacaccggt gttggtattg ttcttgggct cctccacgta 2820 
gtcatagctc caataatcat cctctggcat agtgaacacg tccccccgcg ttactgcagg 2880 
cagaacgggg agcagtgagt gtcaggctgt ggagggagcc ccaggcccac ccaccagggc 2940 
tctgaactca ccttggggct tgcactgctc catgtagtcg gcacagcagc tctgatagta 3000 
agtgcaaagc tcgtcacact gacacttctt gctggccatg aaaccctgag tgcagcggcc 3060 
cttgcatgac tctatgggag ggaatatcag gtttacagcc caatctaggg cacctgccca 3120 
acctgcactt ccctaggtac ccaccaatcc cctcccacac cttggtcagc cagagaaacc 3180 
catgccacca gggctagtat gaaaaagggc ctcaggggtg ccatggcagg cctctagccc 3240 
agggccttgg caagctgggc gcggagcttc tggaatctcg ctgtcctgcc tgaaaaaaga 3300 
agcagactga agaagagttc ctagttccct gggtttctgc cctttatttg ctcatcctct 3360 
ggcccagccc cattgccctc ctccaaacac agctgcagca aagggtcaca ttcccagaac 3420 
cccagcccca ggagagctgg gaaacagaaa accctcgcca agaccaaagt cagtagggtc 3480 
acgggcagga gggataacac gcttagctta gctggggagg tggaaagaag catgtgttgt 3540 
caccctctga gccagtcccg ttaatctccc tgagccttac tttttataaa gtgggaccat 3600 
ggtgccttgc ctcatcaggt gttgagagat tccgtgagct agaacagaca aaacgtttcg 3660 
tgcctggagt agcttccaac tcattcccat aagccgttat cgatttactg tttgatcagg 3720 
ctaggtgctt gtcccatcct accccccgct tcgaatctgg atttttgggg caagaagggg 3780 
ggttggggga gagctggcaa gcactttggg ggaggttttc ttttcttctc ataaaagaac 3840 
aaagcttcat ttctggcctc tccttgttct ctctaagctg ggtgttacag cataggaagt 3900 
agtgggtcag agtctattct tctttcttta ttttttttag atttatttat tttatgtttt 3960 
gtgtataagt gtctgctcac atgtgcatct gtgcaccaca tgcatgtctt gtgtctatgg 4020 
aggtcagaag agggctttga ataccctgga actggagttt tgaacagtta tgagctgccg 4080 
tgtggatgct gagaatcaaa cccaggtcct ctgtaagaac aagtactctt aaaggctgag 4140 
ccatctttcc agtcccagag cccattcctg aggctttcac taatccattg atcctcgggg 4200 
gaccaccctg gccacacctt caatgacctc atttatttta aaaaaaaaat ggactcattg 4260 
ggcatacttt ctagactcac atactaagtg ggatttctct ataaagaagt gctcactggg 4320 
gtagagtgcc aggttttggg ccaaattcca agcactggca cacttctgaa gcccctccgt 4380 
tttctgttct gtaatcacag gcgagcgtgc ctttggtgtc tcttctctat ggaccgcagt 4440 
agtctcagcg gcaaaatgaa acactaaatt ttactcccta cagacgcgtg aagcctaagt 4500 
ggaaaccggc attaaagggc tttaagaatc tcaactgcga ttctttaacc atccggaggg 4560 
gacgtggata catgtagcca gcttgcttcc acattttggg gagccgagcg agcggtagga 4620 
aatggaagac agctctttac agccctttct acagcatctt gcacaccacc aaggggagac 4680 
tggggagagg aggcggagcc aggtgtgggc gtggctggag acctggggta ggcttgcgcc 4740 
tgcgtcgggg gcggagcccg tgaaacctag aggcggggcg tcaaatcctt gactctgctg 4800 
ctcagaggcg tggttgctgt tgagcatctt agctccgctg tgcttagatt ggagcagcgc 4860 
tttgttccgg gcaccggcgt ctctaccctc ccgcgtctgg tccatgcttc tctctccctt 4920 
catgcccttc ctaagtcgct gagtcccgga gctgccctcc tccttctgct tctacacttg 4980 
tagcccagca cctttacc 4998 

<210> 51 
<211> 11176 
<212> DNA 
<213> Mus sp. 

<400> 51 

gcagctgggc aaacgttggc gatgccggtg caaagtatat acccggtggt tagcagaagc 60 
tgagaacttt tagccgaaag ccggctccct aagccgaagc taggcaagta ggggaagaaa 120 
aagaaaaaaa aaattccaga gaagcttcca gagcctcctc ctcttccctc ttccttcaaa 180 
aggactgcaa gtccgcagtc accctccacc cagcaagagt tagggcctcg aaccccggtc 240 
acgctgcctc cgcctcctgc cgaacgtaac gggggacccg tgcgtaaagc gtgacgcgct 300 
ggaatcctcc gtctgacgcg gggcacgcac aggccgcagc ccctccgccc gccccgcccc 360 
tgacgtccgg gcacgttcta ttttggaacg ccgaggccac gttgctaagg gagggggcag 420 
cgtggctttg tgattggctg tcgcgcgcag ctttagccaa tcagcgttcc cttcctattt 480 
gtagagcgta gctcccttcc ttgctttttg tggttcttcc cgtgctgggg gtctccaaga 540 
ggagagctag gattcttgtc gcgatcggga ctcgttgtca ccccatggtc tgcgaggact 600 
tgtgtggacc tggtctgttg tcataagcta gaggcttttg gctgagtgtt agcgcctcta 660 
agggggaact gaaggcctca tccttctcag gcacacatat acgtgctcct gagctctaga 720 
cactcagtcc ttccgaggtg ttcaaacact agatgagcta gcctacggag aggcagccag 780 
gtggtctcta aaaggtctgc cttcccttag ttcccaggct ctgattggcc agggattcag 840 
cccttccctc gccacgcccc ctagagtagt taagcctcta ggattccact tgcgggaagg 900 
gggggggggg gggcgtgatg gacgcttctt ggggacgcag atcctatgtc accccatccc 960 
ctgcaagaca gtctgagaga ttctcgctgt cacttttctc tgcctatcag ttcactgaaa 1020 
cctgtcagtc tcactgggaa gagacagaca ctcggaaggg atgctctcaa ctcttaggcc 1080 
ggtcccccaa caccgttgga actgggatct ccgcctgcgg gagccctcat gcagtggggg 1140 
gtgtgtttgt gtgtgagtgg agaggaaggc ttggctaagg cctctccctc tccctccctc 1200 
tgtggtgggg gttggggggt tttggctgta tgtgtgtgtg aatgtctgtg gctccatccc 1260 
gggagtttgt caccaggttc tgtccagcct cctctcccac ccaccccccc acacctaaga 1320 
gtcaccaacc cggggtgtga ttcaccaccc gctggaaccg tgcaaccttt ccccgaggaa 1380 
gaaggaggag gtagaaggca gttgaacaga atctctcatt aaccactgcg tcacggtgta 1440 
gtggaagggt gggtgttgtg gctttttgcc tgtgacacac acatccacac ccgctcaccc 1500 
tgtgctcact cacagggtcg gtctctctta tctctcttgg gcgtgtgtgt gtcggtggct 1560 
ttgtttgtgt gtctacgcct gtgtgtgtat gtctcacccc gtaggagtgc gccggtctcg 1620 
gggaaatgcc cggctccttc gtgccaacgg tcaccgcaat cacaaccagc caggatcttc 1680 
agtggctcgt gcaacccacc ctcatctctt ccatggccca gtcccagggg cagccactgg 1740 
cctcccagcc tccagctgtt gacccttatg acatgccagg aaccagctac tcaaccccag 1800 
gcctgagtgc ctacagcact ggcggggcaa gcggaagtgg tgggccttca accagcacaa 1860 
ccaccagtgg acctgtgtct gcccgtccag ccagagccag gcctagaaga ccccgagaag 1920 
agacagtaag tatgaggcct caggagttgg gatggaggag cctagctagg gatgtgggct 1980 
cagtttgtac agtgccttgc tgccatgcat gaagatccct agcacagcat aagccaggag 2040 
tggttatgca gacctgtaac cccagctctc agaaggtgga ggcaggagga gcaggagttc 2100 
gaggccagcc tgtgctactt atggagtcca gcctgcactg caagagatca ttattttcaa 2160 
aagttggcct tggggggagg tgggtgaggg aagtaagaga aagtgacagt aattttgtca 2220 
cttaatagtt ggaggttcct ctgaggcctc aagtctgaag gaactttacc attctggcca 2280 
gtgaggagta ggggttatta tttggggttc aggaggaagg aagttttctt agggctgata 2340 
gaggtacccc cagatctcat ggtccttatc tctgactcag cttaccccag aagaagaaga 2400 
aaagcgaagg gttcgcagag agcggaacaa gctggctgca gctaagtgca ggaaccgtcg 2460 
gagggagctg acagatcgac ttcaggcggt aaggaggagt ctgggggtgt cttgaggccg 2520 
tgctgggagc actctgcctt gttcttcccc cgtttctcac tgtgcctgtg tcctaaacga 2580 
ggaaaccccc tcttagggaa caggggtcag tataggctga tggagtggct ccatatgcat 2640 
gctcagaccc atgcccactt actttcgact gttccccact ttccctgaat atgtccccac 2700 
atgtcaccct cctggctttc tctcagccta aggagacaag ctagaggagg taattctctc 2760 
accttctttt cttcactaaa taataatcca ttttgccttc ctgcctccat ttttttttcc 2820 
tgagctgggg atctacctgt cgtagttcag ccctcctccc ccaacttgat agcctcaagt 2880 
ttcagccctt ggctgagatg ccatcatcct gactggctct ggctggaaac tattttgtgc 2940 
taagtcaatt ccttgtctgc tacttcagct atctacagty ctgccgaact tgagctggtg 3000 
gcgcccacca agcccacttc tttctctctt ttttacctca gtgcaacccc ccacacacaa 3060 
aacttcatgc ctgccccttg aaaccagggt gcgtctctga ctccccgtcg ggaggctgaa 3120 
ggagatgggt aacagaacct cattaaaaac aacacataag cattacctac tgactcaaca 3180 
aactgtagtg tttttctttt ttcctctcaa aaaattattt cgtttgttta tttattattt 3240 
gcttatgttt gagtgagtgc tggtgcacca cagcacacat acgaggtcag agggaaattt 3300 
tcatagtttg ttctctcctt ccgtgttgtg ggtgcttgct ggcaatctcc ttcactcagt 3360 
gagctacaat gcccccttct gccctttaag gcagagtact ccttagtaca gggggaccct 3420 
ttcctcggcc tctcaaagtt gagattacaa atgttcacca tcacaccagg cttggagttc 3480 
ttgcctatca gtgacgtcca ctcctgccta gcttcttccc aaccatcttt tagtctgatg 3540 
gggaaaccga ggcacgagta gcatggtcta ccaggatttc ctcttagggg acggtcccct 3600 
cagttgggag ggagctgtcc agccccctgg atcagcagca agaatgtatg agtgtggggt 3660 
tgggcgggtg aagctactct gtgtggtcgc tgaccagcaa ttctcctttc tctgtctcct 3720 
atgacctggc cctgctggga tccattagga aactgatcag cttgaagagg aaaaggcaga 3780 
gctggagtcg gagatcgccg agctgcaaaa agagaaggaa cgcctggagt ttgtcctggt 3840 
ggcccacaaa ccgggctgca agatccccta cgaagagggg ccggggccag gcccgctggc 3900 
cgaggtgaga gatttgccag ggtcaacatc cgctaaggaa gacggcttcg gctggctgct 3960 
gccgccccct ccaccaccgc ccctgccctt ccagagcagc cgagacgcac cccccaacct 4020 
gacggcttct ctctttacac acagtgaagt tcaagtcctc ggcgacccct tccccgttgt 4080 
tagcccttcg tacacttcct cgtttgtcct cacctgcccg gaggtctccg cgttcgccgg 4140 
cgcccaacgc accagcggca gcgagcagcc gtccgacccg ctgaactcgc cctcccttct 4200 
tgctctgtaa actctttaga caaacaaaac aaacaaaccc gcaaggaaca aggaggagga 4260 
agatgaggag gagaggggag gaagcagtcc S^SStgtgt gtgtggaccc tttgactctt 4320 
ctgtctgacc acctgccgcc tctgccatcg gacatgacgg aaggacctcc tttgtgtttt 4380 
gtgctctgtc tctggttttc tgtgccccgg cgagaccgga gagctggtga ctttggggac 4440 
agggggtggg gcggggatga acacccctcc tgcatatctt tgtcctgtta cttcaaccca 4500 
acttctgggg atagatggct gactgggtgg gtagggtggg gtgcaacgcc cacctttggc 4560 
gtcttacgtg aggctggagg ggaaagagtg ctgagtgtgg ggtgcagggt gggttgaggt 4620 
cgagctggca tgcacctcca gagagaccca acgaggaaat gacagcaccg tcctgtcctt 4680 
cttttccccc acccacccat ccaccctcaa gggtgcaggg tgaccaagat agctctgttt 4740 
tgctccctcg ggccttagct gattaactta acatttccaa gaggttacaa cctcctcctg 4800 
gacgaattga gcccccgact gagggaagtc gatgccccct ttgggagtct gctaacccca 4860 
cttcccgctg attccaaaat gtgaacccct atctgactgc tcagtctttc cctcctggga 4920 
aaactggctc aggttggatt tttttcctcg tctgctacag agccccctcc caactcaggc 4980 
ccgctcccac ccctgtgcag tattatgcta tgtccctctc accctcaccc ccaccccagg 5040 
cgcccttggc cgtcctcgtt gggccttact ggttttgggc agcagggggc gctgcgacgc 5100 
ccatcttgct ggagcgcttt atactgtgaa tgagtggtcg gattgctggg cgcgccggat 5160 
gggattgacc cccagccctc caaaactttt cctgggcctc cccttcttcc acttgcttcc 5220 
tccctcccct tgacagggag ttagactcga aaggatgacc acgacgcatc ccggtggcct 5280 
tcttgctcag gccccagact ttttctcttt aagtccttcg ccttccccag cctaggacgc 5340 
caacttctcc ccaccctggg agccccgcat cctctcacag aggtcgaggc aattttcaga 5400 
gaagttttca gggctgaggc tttggctccc ctatcctcga tatttgaatc cccaaatagt 5460 
ttttggacta gcatacttaa gagggggctg agttcccact atcccactcc atccaattcc 5520 
ttcagtccca aagacgagtt ctgtcccttc cctccagctt tcacctcgtg agaatcccac 5580 
gagtcagatt tctattttct aatattgggg agatgggccc taccgcccgt cccccgtgct 5640 
gcatggaaca ttccataccc tgtcctgggc cctaggttcc aaacctaatc ccaaacccca 5700 
cccccagcta tttatccctt tcctggttcc caaaaagcac ttatatctat tatgtataaa 5760 
taaatatatt atatatgagt gtgcgtgtgt gtgcgtgtgc gtgcgtgcgt gcgtgcgtgc 5820 
gagcttcctt gttttcaagt gtgctgtgga gttcaaaatc gcttctgggg atttgagtca 5880 
gactttctgg ctgtcccttt ttgtcacttt tttgttgttg tctcggctcc tctggctgtt 5940 
ggagacagtc ccggcctctc cctttatcct ttctcaagtc tgtctcgctc agaccacttc 6000 
caacatgtct ccactctcaa tgactctgat ctccggtctg tctgttaatt ctggatttgt 6060 
cggggacatg caattttact tctgtaagta agtgtgactg ggtggtagat tttttacaat 6120 
ctatatcgtt gagaattctg ggtggaaatg tctgatcagg agaagggcct gccactgccg 6180 
accacaattc attgactcca tagccctcac ccaggctgta tttgtgattt ttttcatttt 6240 
gtttttttgt attttgcacc tgaccccggg ggtgctgggg cagtctatca ctgggcagct 6300 
cccctccccc ccttggttct gcactgtcgc caataaaaag cttttaaaaa actgtatcct 6360 
tcaggtcaaa gtgtctgttt tccctggaca tctactacat ggcttccttt cagaaaaacg 6420 
gagtttggat tgctagggaa gtcttgctgg cacttagtgg gacgcctaac gaatcagaac 6480 
ctacaacggg actaaaagga agtggagact tgctaggttt tcccatgttc ccaggctggg 6540 
ccacctactt gaaaaaataa ggggcggaaa agtgtaaggt accaaatttg gtgaagggtc 6600 
tgggagaatt tcatgatcgg aaaagaattt attcaccttg ggtgtgcaat gaactttcag 6660 
caacagttaa gggcaagggt gtaaaagctg ggcacaactt gtaaatccta gcatttgaga 6720 
ggtggaggca aggggatcaa ctggtggagt tcagtgtcat gtggatcgta gataccaagc 6780 
gcaaagatct gctatgggga gagggcttgg tacaccaggg gagccagaag tttcgtggtg 6840 
agggtagtgg agggcaagtg gagagtgaga gttagcctca gggagattct acaggcaatg 6900 
atgcagagtt cagacgctcc ctttgaaagc actagagagc cgcagcaggt tttgagcaga 6960 
gaaggttaga gttaggtggt ctcttctagc ccatcccagg ctgaggagga cgctgagggt 7020 
ttcaagaagg atcgagaatg gaaagcagag gagaagaagg atccaagagg catggaggag 7080 
gcagaacaca tttctcttct ttaatagcaa gcctggaaag gataacttgc tgcaggagga 7140 
gatgctcacc agtcgggtgg tctagggggt tcttggaaaa gagaaggcat ttgctcaagc 7200 
ctcggttccc ccattctcgc tcttctgtca gcttgtcttc cattaagtgt gtgtctcaag 7260 
gccaccctgc tcaggactcc ttgtgagacg accttctatg ctcgagttca ttaaaaacac 7320 
aattgcctgg tgccgtgctc tctccactgg ctcagttacc tcaaaagacc agggctaaag 7380 
gtgtgatcac aactctatcc ccattactgc tccaacgcag agacaggact gagccggagt 7440 
gaacaaatga acaaaaatga ctaataatgc atgcgtgatt aaatacataa aagagcagat 7500 
gactggatga gcaaatcgtt taaggagaga cagcaagatc ctagaatttt ggagactaat 7560 
ttaaatccat ctttgagatg catttggtcg gaaattcctg ggaggaaaaa aagtgtaaat 7620 
atgaagagag aataaatgag aataggggtg gcttcagaga ggttaactgc gcgctggtcg 7680 
cttttgtaca agaatgtgaa ttgcagggag caaaatggga tagatactcc cgcccgaaag 7740 
gtggaattga accactctgt cgctaaacag ctacaggttt gaagcctgca ccccagacca 7800 
ctgaggatca tccgggcgaa aggagctatt ttcagttagt tatataaagg cgagatacta 7860 
ctacttttta cacttatggt cattatttgt ggtatacagt agataattaa tttcaatggt 7920 
ttcgaacatt ttttttcact ttttcttgtg aacatgtgtt tcctcagtaa agtgttccgt 7980 
gaatgactct actaactaaa aagtaagtag cttcatttgc atagcgcctt gcattttggg 8040 
aagcagcgcc taaagtgcct gtctccctaa ctaaaagcag aattttttgc aaagtgaaaa 8100 
gtcagtttta tttttgtttg tttgtttgct tgtttgtttt taatggaaaa acttctcacg 8160 
cggcccattc gtagcagaat tcgagatttt ctgcaagcga gaagcaagac tttcgtaggg 8220 
tctgacggca cgcggccgca gagcgacacc tgccgttgct ttatagaact gcaagtatgt 8280 
agggaatcta ctgagtccct aggtgatgga gttgacaacc aactcccctt gagtttagac 8340 
gctaaaaacc atcccttttt atatttatgt gattagccca gggaaactaa ggctcagaca 8400 
tggataatac cacagccgag ttcttgtagc ccaactccct aggggaaatg aaacctacag 8460 
ttgtggtttt aatatgcttg gcccaggggc agtggcccta ttggcaggag tggccttatt 8520 
agcggaggtg taccttgtta gagaagtgtg tcacttggag gcgaggtttt gaggtacgta 8580 
tgctcaagtc tggccagtgt gatcctggct gtctgcagaa cgtggtctcc ttctggctgc 8640 
cttcggatca aggtgtagaa ctctcagctc cttctccagc accatgtctg cctgcttaat 8700 
gctttgcttc tttccatgac gataatgaac tgtgcctctg aaactgtaag tcagcccccc 8760 
agttacatgt tttcttttat aagagttgca tatatatatg tatgtatata tgtatgtata 8820 
tatgtatgta tatatatata tatatataaa cagggtctca ctctttagct ctggctggcc 8880 
tgaaattcac tatgtagccc aggattgcct gaactttgaa gcaatcttcc tgcctcagcc 8940 
tcccaatggt attacaggca tgagtcacaa caagccattt aaatcttatg atgacttata 9000 
agaagacaga aaatcagagt tcctttacct agttcacaga tccctacaat ctaacctcgt 9060 
tcgctccata aacagcccta ccccaccctc ctggaactgc tttgaggaat gctgcaggct 9120 
ctcacaggca cactcctcct tggttaatct cttcagcctg gttgccttcc ccccccatgt 9180 
ccatgtggcc caaagcctct catcctgttc tcaaatacca ctagctagta aggctccccg 9240 
acctgacccg gtttaaatat tagaaaaggg tcactttctc cctgccacag acaaccaaac 9300 
caccatatgc ttgtcactta ctacctgact atgaaggtta atagatgtct tcacaacctt 9360 
tctctgagcc tcagtttccc cacctgcata atgcatctga gacacagaat tccctagagc 9420 
tgtggttctc ctcattccta gtgctgggac cctttaatac atttcctcat gttgtggtga 9480 
ccccaccacc accataaaat tatttccatt gatacttcat aactgtaatt ttttctattg 9540 
ttatgaatag taatgtaagc atttgtgttt cccagtgatc ttagatgacc ctgtggaaga 9600 
gtcattccac cccaaagggg tccccaccac aagttaagaa ttcctgccat agaggaatca 9660 
cagggaccat ggattaacac ttgggtcgac ttttgggctg ccttctggga ggcgctagag 9720 
ctaatgacag ctacatcaat ttctgaaatt ttgtgtgtgt gtgtgtgtgt gtgtgtgtgt 9780 
gtgtgtgtgt gccctgagtc gggtgctgag ataggccagt ggctttagtg ttcctggacc 9840 
cattactcac cagaactctc ccctcacctg attctttgat gtgaacacta tgtcttcata 9900 
gtggcggtgg caatagcagc aacagtgaac taaattttaa aagtagaact cagctggaga 9960 
tacaaatatt gcagttttga agttggggtg gattgtctaa taacttaata acataaccca 10020 
gaagagaggc cccttggtct tgcaaacttt atatgcctca gtacagggga acgccagggc 10080 
caagaagtgg gagtgggtgg gtaggggagc agggtggggg gagggtatag gggactttcc 10140 
ggatagcatt tgaaatgtaa atgaagaaaa tatctaataa aaatttgaaa aaaaatgtta 10200 
ccccagtttg gcctggatct cactacctca accagactgg catgtgactc tgctgagatc 10260 
tgcctacttc tgcctcctgg gtgcagaaga caatttttgg aagttagttc tcttcttcca 10320 
tcttgtggat tccagggatt gaactcgggt catcaggctt ggctgcaagt gacttactta 10380 
ggtgtctccc agaccctctc ggtttgatta gttagatgct gcacttcatg cctgactttc 10440 
gcactatgta gatagagcaa tgtctataac atctcctaca atgatatgta tatcaagagc 10500 
caagtgatga gatggctcag tgggtaagag cacagactgc tcttccaaag gtcccgagtt 10560 
caaatcccag caatcacata gtggcttcca ttccctctta tggaatgtct gaagactgct 10620 
acagtgtact tacatataat aaataaataa atcttaaaaa aaaaaaaccc agccgggcgt 10680 
ggtggcgcac gcctttaatc ccagcacttg ggaggcagag gcaggcggat tcctgagttc 10740 
gacgccagcc tggtctacag agtgagttcc acgacagcca gaactacaca gagaaaccct 10800 
gtctcgaaaa aaaaaagaga gagagggaag tgagagcgca ataatcttaa catttctgtg 10860 
gttgtctttg ctgtagtcta ttctgataag caatgctggc ttgctcccaa ggtaggaagt 10920 
aacatttctt tataaaaggt atttgctctg ctttattttt ctgttttatt tatggtgctg 10980 
aggatggaac ccaggaccct tggcaagcaa ggctagctgt ttaccactga gccatactcc 11040 
agccttgcac tgggggattc taggcaaggg ttctaccact gagccacact ccccaccccc 11100 
atccctctct ggaagattct aggcagttcc atacctagcc tttgatcttt taagacggtc 11160 
ttactagagc tcagtt 11176 



